EXECUTIVE SUMMARY


Mathematical  models  are  often  used  to  estimate  the  fate  and  impact  of  organic  chemicals  in  the 
environment.  Use  of  these  models  requires  a  variety  of  parameters  describing  site  and  chemical 
characteristics.  Aqueous  solubility  (S),  the  octanol/water  partition  coefficient  (Kow),  the  organic 
carbon  normalized  soil/water  sorption  coefficient  (Koc),  vapor  pressure  (Pv),  Henry's  Law 
constant  (H),  and  bioconcentration  factor  (BCF)  are  considered  key  properties  used  to  assess  the 
mobility and distribution of an organic chemical in environmental systems.

One  major  limitation  to  the  use  of  environmental  fate  models  has  been  the  lack  of  suitable 
values  for  many  of  these  properties.  The  scarcity  of  data,  due  mainly  to  the  difficulty  and  cost 
involved  in  experimental  determination  of  such  properties  for  an  increasing  number  of  synthetic 
chemicals,  has  resulted  in  an  increased  reliance  on  the  use  of  estimated  values. 

Quantitative Property-Property Relationships (QPPRs), based on the relationship between two
properties as determined by regression analysis, are used to predict the property of interest from
another, more easily obtained property. Quantitative Structure-Property Relationships (QSPRs)
often take the form of a correlation between one or more structurally derived parameters, such as
molecular connectivity indices (MCIs) or total molecular surface area (TSA), and the property of interest.

Selection and application of the most appropriate QPPRs or QSPRs for a given compound are
based on several factors, including the availability of the required input, the methodology for
calculating the necessary topological information, the appropriateness of the correlation to the
chemical of interest, and an understanding of the mechanisms controlling the property being estimated.

Incorporation  of  QPPRs  and  QSPRs  into  a  computer  format  is  a  logical  and  necessary  step  to 
gain  full  advantage  of  the  methodologies  for  simplifying  fate  assessment.  A  Property  Estimation 
Program  (PEP),  utilizing  MCI-property,  TSA-property  and  property-property  correlations  and 
UNIFAC-derived activity coefficients, has been developed for the Apple Macintosh microcomputer
to  provide  the  user  with  several  approaches  to  estimate  S,  Kow,  Pv,  H,  Koc  and  BCF  depending 
on  the  information  available. 
Structural information required for the MCI and UNIFAC calculation routines can be entered
using either Simplified Molecular Identification and Line Entry System (SMILES) notation or
connection  tables  generated  with  commercially  available  two-dimensional  drawing  programs.  The 
TSA  module  accepts  3-D  atomic  coordinates  entered  manually  or  directly  reads  coordinate  files 
generated  by  molecular  modeling  software.  The  program's  built-in  intelligence  helps  the  user 
choose  the  most  appropriate  QSPR  based  on  the  structure  of  the  chemical  of  interest.  In  addition, 
the  statistical  information  associated  with  each  QSPR  in  PEP  can  be  displayed  to  help  the  user 
determine  the  model's  validity.  For  the  regression-based  modules,  assessments  of  accuracy  based 
on  the  95%  confidence  interval  and  estimated  precision  of  the  experimental  values  are  also 
provided  along  with  the  estimated  property  value. 

PEP also provides a batch mode for the convenient, unattended calculation of MCIs, TSA and
UNIFAC activity coefficients and the subsequent estimation of physical properties for large
numbers of compounds.

A chemical property database, containing experimental values of S, Kow, H, Pv, Koc, and
BCF compiled from a variety of literature sources and computerized databases, was used for
developing the MCI-property, TSA-property and property-property relationships used in PEP.
This database, which currently contains over 800 chemicals, is linked directly to PEP.

The  property  estimation  modules  in  PEP  are  also  linked  directly  to  the  Level  1  and  2  Fugacity 
Models.  The  combination  of  the  various  property  estimation  methods,  chemical  property  database, 
and  simple  environmental  fate  models  provides  users  with  a  methodology  for  predicting  the 
environmental  distribution  of  an  organic  chemical  in  a  multi-phase  system  requiring  only  the 
structure  of  the  chemical  of  interest  as  input. 
OBJECTIVES OR STATEMENT OF WORK


The  primary  goal  of  this  project  was  to  develop  a  microcomputer-based  decision  support 
system  utilizing  Quantitative  Structure  Property  Relationships  (QSPRs)  and  Quantitative  Property 
Property  Relationships  (QPPRs)  to  predict  the  physical/chemical  properties  of  an  organic  chemical 
which  are  necessary  to  model  its  environmental  fate.  The  following  specific  properties  were 
investigated:  aqueous  solubility  (S),  octanol/water  partition  coefficient  (Kow),  vapor  pressure 
(Pv),  organic  carbon  normalized  soil/water  partition  coefficient  (Koc),  Henry's  Law  constant  (H), 
and  bioconcentration  factor  (BCF). 

In  order  to  achieve  the  primary  goal  of  this  research,  the  following  specific  objectives  were 
accomplished: 

1.  A  database  of  experimentally  determined  values  of  S,  Kow,  Pv,  Koc,  H  and  BCF  was 
compiled  for  over  800  organic  compounds  exhibiting  a  broad  range  of  properties  and  expected 
mobility. 

2.  Algorithms  to  calculate  molecular  connectivity  indices  (MCIs),  total  molecular  surface  area 
(TSA)  and  UNIFAC  activity  coefficients  were  adapted/developed  to  run  in  a  microcomputer 
environment  using  SMILES  notation,  connection  files,  or  coordinate  files  to  input  required 
structural  information. 

3.  Using  the  database  described  in  Objective  1  and  the  computational  methods  developed  in 
Objective  2,  a  variety  of  QSPRs  and  QPPRs  for  estimating  S,  Kow,  Pv,  Koc,  H  and  BCF 
were  developed. 

4. A microcomputer-based decision support system was created that uses chemical structure
information to aid the user in choosing the most appropriate QSPR or QPPR.

5. The property estimation routines and property database were linked to simple environmental
fate models (the Level 1 and 2 Fugacity Models) to provide users with a methodology for
predicting the environmental distribution of an organic chemical in a multi-phase system
requiring only the structure of the chemical of interest as input.
BACKGROUND AND SIGNIFICANCE


Mathematical  models  are  often  used  to  estimate  the  fate  and  impact  of  organic  chemicals  in  the 
environment.  These  models  often  idealize  the  environment  as  a  system  of  connected 
compartments, i.e., water, soil, sediment, air and biota. The complexity of these models ranges
from  simple  steady  state  models  to  non-steady  state  models  which  include  a  large  number  of 
compartments,  transport  between  compartments  and  degradation  processes. 

Use  of  these  models  requires  a  variety  of  input  parameters  which  describe  site  and  contaminant 
physical-chemical  and  biological  characteristics.  Aqueous  solubility  (S),  octanol/water  partition 
coefficient  (Kow),  the  organic  carbon  normalized  soil/water  sorption  coefficient  (Koc),  vapor 
pressure  (Pv),  Henry's  Law  constant  (H),  and  bioconcentration  factor  (BCF)  are  considered  key 
properties  used  to  assess  the  mobility  and  distribution  of  a  chemical  in  environmental  systems. 

One  major  limitation  to  the  use  of  environmental  fate  models  has  been  the  lack  of  suitable 
values  for  many  of  these  properties.  The  scarcity  of  data,  due  mainly  to  the  difficulty  and  cost 
involved  in  experimental  determination  of  such  properties  for  an  ever  increasing  number  of 
synthetic  chemicals,  has  resulted  in  an  increased  reliance  on  the  use  of  estimated  values. 
Quantitative Property-Property Relationships (QPPRs) and Quantitative Structure-Property
Relationships  (QSPRs)  have  been  used  by  environmental  scientists  and  engineers  to  obtain 
estimated  values  for  a  variety  of  physical/chemical  properties  for  use  in  environmental  fate  and 
assessment  modeling. 

QPPRs, based on the relationship between two properties as determined by regression
analysis, are used to predict the property of interest from another, more easily obtained property
without a specific concern for molecular structure. Frequently, the regression equations are
expressed in terms of the logarithms of the two properties. Researchers have found that a number of
environmental properties can be related to one another in this manner. For example, QPPRs have
been developed to estimate S, Koc and BCF from Kow, and Koc and BCF from S [1-3].
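As an illustration of the log-log form described above, the following is a minimal sketch of fitting and applying a property-property regression by ordinary least squares. The function names and the synthetic data are assumptions made for this example only; the slope of 0.8 and intercept of 0.1 are arbitrary and are not coefficients from any published QPPR.

```python
import math

def fit_log_log(x_values, y_values):
    """Least-squares fit of log10(y) = slope * log10(x) + intercept."""
    lx = [math.log10(x) for x in x_values]
    ly = [math.log10(y) for y in y_values]
    n = len(lx)
    mean_x = sum(lx) / n
    mean_y = sum(ly) / n
    sxx = sum((a - mean_x) ** 2 for a in lx)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(lx, ly))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def predict(slope, intercept, x):
    """Apply the fitted relationship to a new value of the input property."""
    return 10 ** (slope * math.log10(x) + intercept)

# Synthetic (made-up) data that follow log Koc = 0.8 log Kow + 0.1 exactly.
kow = [1e2, 1e3, 1e4, 1e5]
koc = [10 ** (0.8 * math.log10(k) + 0.1) for k in kow]

slope, intercept = fit_log_log(kow, koc)
estimated_koc = predict(slope, intercept, 1e3)
```

With real data the points scatter about the line, so the fitted coefficients should be reported together with their regression statistics.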
QSPRs are methods by which the properties of a chemical can be inferred or calculated from
knowledge of the structure of a molecule. QSPRs often take the form of a correlation between a
structurally derived parameter(s) and the property of interest. For example, relationships between
structurally derived parameters, such as molecular connectivity indices (MCIs) and total molecular
surface area (TSA), and properties such as S, Kow, BCF, and H have been reported.

Molecular connectivity, developed by Randić [4] and refined and expanded by Kier and Hall
[5-7], is a method of bond counting from which topological indices, based on the structure of the
compound, can be derived. For a given molecular structure, several types and orders of MCIs can
be calculated. Information on the molecular size, branching, cyclization, unsaturation and
heteroatom content of a molecule is encoded in these various indices [5]. MCIs have been used to
predict Koc [8,9], S [1], Kow [10], H [11] and BCFs [12].
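The bond-counting idea can be sketched for the two lowest-order simple indices. This is an illustrative calculation only, assuming a hydrogen-suppressed skeleton supplied as a list of bonded atom pairs; the function names are hypothetical, and higher-order, cluster, chain and valence indices are not covered.

```python
def vertex_degrees(n_atoms, bonds):
    """Number of non-hydrogen neighbours (delta) for each skeleton atom."""
    degrees = [0] * n_atoms
    for i, j in bonds:
        degrees[i] += 1
        degrees[j] += 1
    return degrees

def chi0(n_atoms, bonds):
    """Zero-order simple index: sum over atoms of delta ** -0.5.

    Assumes every atom has at least one bonded neighbour.
    """
    return sum(d ** -0.5 for d in vertex_degrees(n_atoms, bonds))

def chi1(n_atoms, bonds):
    """First-order simple index: sum over bonds of (delta_i * delta_j) ** -0.5."""
    deg = vertex_degrees(n_atoms, bonds)
    return sum((deg[i] * deg[j]) ** -0.5 for i, j in bonds)

# Hydrogen-suppressed skeletons of two C4 isomers.
n_butane = (4, [(0, 1), (1, 2), (2, 3)])
isobutane = (4, [(0, 1), (1, 2), (1, 3)])

chi1_n_butane = chi1(*n_butane)    # about 1.914
chi1_isobutane = chi1(*isobutane)  # about 1.732
```

The two isomers contain the same atoms but give different first-order values, which is how branching information is encoded in the index.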

TSA, a direct estimate of molecular surface area based on the concept of the van der Waals
radius, has been correlated with S, Kow, Pv and H [13-22]. Several different algorithms, requiring
the 3-D atomic coordinates of the solute molecule and the van der Waals radii of solute and solvent
molecules as input [19,23], have been developed to calculate TSA.
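To make the inputs concrete, the sketch below estimates a van der Waals surface area numerically from 3-D atomic coordinates and radii. It is a simple point-sampling approximation written for illustration, not one of the algorithms cited above; the solvent radius is ignored, and the function names, the carbon radius of 1.7 Å and the coordinates are assumptions of this example.

```python
import math

def sphere_points(n):
    """Roughly uniform points on a unit sphere (golden-angle spiral)."""
    golden_angle = math.pi * (3.0 - math.sqrt(5.0))
    points = []
    for k in range(n):
        z = 1.0 - (2.0 * k + 1.0) / n
        ring = math.sqrt(1.0 - z * z)
        phi = k * golden_angle
        points.append((ring * math.cos(phi), ring * math.sin(phi), z))
    return points

def total_surface_area(atoms, n_points=500):
    """Approximate exposed van der Waals area.

    atoms: list of (x, y, z, radius) tuples in consistent length units.
    A sample point on one atom's sphere counts as buried if it lies
    inside any other atom's sphere.
    """
    unit = sphere_points(n_points)
    total = 0.0
    for a, (ax, ay, az, ar) in enumerate(atoms):
        exposed = 0
        for ux, uy, uz in unit:
            p = (ax + ar * ux, ay + ar * uy, az + ar * uz)
            buried = any(
                math.dist(p, (bx, by, bz)) < br
                for b, (bx, by, bz, br) in enumerate(atoms)
                if b != a
            )
            if not buried:
                exposed += 1
        total += 4.0 * math.pi * ar * ar * exposed / n_points
    return total

# Illustrative input: one isolated atom, then two overlapping atoms (angstroms).
single = [(0.0, 0.0, 0.0, 1.7)]
pair = [(0.0, 0.0, 0.0, 1.7), (1.5, 0.0, 0.0, 1.7)]

area_single = total_surface_area(single)
area_pair = total_surface_area(pair)
```

Overlap between neighbouring spheres removes area from each atom, so the value for a bonded pair is smaller than the sum of the two isolated spheres.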

Group  contribution  or  fragment  constant  methods  are  another  important  category  of  QSPRs. 
The  basic  idea  of  a  group  contribution  method  is  that  while  there  is  an  enormous  number  of 
chemical  compounds,  both  synthetic  and  naturally  occurring,  the  number  of  functional  groups  that 
make  up  these  compounds  is  much  smaller.  A  single  numerical  value  is  assumed  to  represent  the 
contribution  of  each  functional  group  (i.e.  a  specified  atom,  a  group  of  atoms  bonded  together  or 
structural  factor)  to  the  physical  property  of  interest.  It  is  also  usually  assumed  that  the 
contributions  made  by  each  group  are  independent  of  each  other.  By  summing  up  the  values  of  the 
various  fragments  or  groups  the  property  of  interest  can  be  directly  calculated. 
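The additive assumption can be written out in a few lines. In this sketch the group names, the contribution values and the function name are all hypothetical placeholders chosen for illustration; they are not taken from any published fragment scheme.

```python
def estimate_property(group_counts, contributions, base=0.0):
    """Sum group contributions, each weighted by how often the group occurs.

    group_counts: mapping of group name -> number of occurrences.
    contributions: mapping of group name -> contribution value.
    base: optional constant term used by some schemes.
    """
    missing = [g for g in group_counts if g not in contributions]
    if missing:
        raise KeyError(f"no contribution value for groups: {missing}")
    return base + sum(n * contributions[g] for g, n in group_counts.items())

# Placeholder values only (NOT real fragment constants).
HYPOTHETICAL_CONTRIBUTIONS = {"CH3": 0.5, "CH2": 0.5, "OH": -1.0}

# 1-propanol decomposed as CH3-CH2-CH2-OH.
n_propanol = {"CH3": 1, "CH2": 2, "OH": 1}
estimate = estimate_property(n_propanol, HYPOTHETICAL_CONTRIBUTIONS)
```

Raising an error for a group with no tabulated value, rather than silently treating it as zero, keeps the estimate from being applied outside the range of the scheme.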

The  UNIFAC  (UNIQUAC  Functional  Group  Activity  Coefficient)  group  contribution  method 
[24-26]  has  been  used  by  environmental  researchers  to  estimate  S  and  Kow  [27-31].  The 
UNIFAC  method  was  developed  to  estimate  liquid  phase  activity  coefficients  in  mixtures  of 
nonelectrolytes [25]. In this technique, the activity coefficient is divided into two parts: a
combinatorial part, which reflects the size and shape of the molecules present, and a residual portion,
which depends on functional group interactions. Various parameters, such as van der Waals group
volumes and surface areas and group interaction parameters, are input into a series of equations
from which the combinatorial and residual parts are calculated. Values for the group parameters
have been tabulated and can be found in the literature [25,26]. UNIFAC is specifically designed to
take into account interactions between groups and is appropriate for multiple solute/solvent
systems. UNIFAC also permits estimates to be made as a function of temperature.
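The combinatorial part alone can be sketched as follows, using the commonly published UNIQUAC/UNIFAC expression with a coordination number of 10. The function name is hypothetical, the molecular volume (r) and surface (q) parameters are assumed to have already been summed from tabulated group values, and the numbers in the example are illustrative only. The residual part, which needs the group interaction parameters, is not shown.

```python
import math

Z = 10.0  # lattice coordination number conventionally used in UNIFAC

def ln_gamma_combinatorial(x, r, q):
    """Combinatorial contribution to ln(activity coefficient) per component.

    x: mole fractions; r: molecular volume parameters;
    q: molecular surface area parameters (same ordering as x).
    """
    sum_xr = sum(xi * ri for xi, ri in zip(x, r))
    sum_xq = sum(xi * qi for xi, qi in zip(x, q))
    l = [(Z / 2.0) * (ri - qi) - (ri - 1.0) for ri, qi in zip(r, q)]
    sum_xl = sum(xi * li for xi, li in zip(x, l))
    result = []
    for ri, qi, li in zip(r, q, l):
        phi_over_x = ri / sum_xr                   # volume fraction / x
        theta_over_phi = (qi / sum_xq) / phi_over_x  # surface / volume fraction
        result.append(
            math.log(phi_over_x)
            + (Z / 2.0) * qi * math.log(theta_over_phi)
            + li
            - phi_over_x * sum_xl
        )
    return result

# Illustrative binary mixture: a larger solute dilute in a smaller solvent.
ln_gamma_c = ln_gamma_combinatorial(x=[0.01, 0.99], r=[4.0, 0.92], q=[3.5, 1.40])
```

Working with the ratios of volume and surface fractions to mole fraction avoids dividing by a mole fraction of zero, so the same function can be evaluated at infinite dilution.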

In  most  cases,  more  than  one  estimation  method  is  available  for  a  particular  property. 
Estimation methods, however, have widely varying accuracies, and indiscriminate use of these
techniques can result in large errors. Selection and application of QSPR or QPPR methods
requires varying degrees of expertise that depend on the structure of a particular chemical of
interest, knowledge of the mechanism of the process, the extent of the database used to develop the
QSPR or QPPR and the complexity of the structural analysis required to relate structure to the
property. For example, some QSPRs and QPPRs are broader than others in the range of chemicals
that  are  covered,  and  some  methods  have  been  established  with  a  better  understanding  of  the 
mechanisms  or  properties  involved.  In  many  cases  estimation  methods  are  developed  from 
empirical  or  semiempirical  correlations.  The  success  of  the  correlation  is  dependent  on  many 
factors  including  the  type  and  number  of  compounds  used  in  its  development. 

Incorporation of QSPRs and QPPRs into a computer format is a logical and necessary step to
gain full advantage of the methodologies for simplifying fate assessment. A practical computerized
property estimation program utilizing QSPRs and QPPRs should have the following attributes:
it should be simple and flexible to use for both experts and non-experts, include sufficient statistical
information regarding the development of the QSPRs and QPPRs so that the range of applicability
of such models can be evaluated, and provide an indication of the accuracy of the estimated
property.
A microcomputer-based Property Estimation Program (PEP), utilizing MCI-property,
TSA-property and property-property correlations and UNIFAC-derived activity coefficients, was
developed to provide both experts and non-experts with a fast, economical method to estimate a
compound's S, Kow, Pv, Koc, H, and BCF for use in environmental fate modeling. The user can
input the required structural information for the MCI and UNIFAC calculation routines using either
SMILES notation or coordinate files (connection table or “Molfile” formats) generated with
commercially available two-dimensional drawing programs such as ChemDraw™ [45],
Chemintosh™, or ISIS/Draw™. The TSA module accepts 3-D atomic coordinates entered
manually or directly reads coordinate files generated by molecular modeling software such as
Alchemy III™ or Chem3D Plus™. For the property-property, TSA-property and MCI-property
modules, the user can select from either "universal" or class-specific regression models. To aid the
user in choosing the most suitable regression model, the program automatically suggests the most
appropriate regression model(s) based on the structure of the compound. In addition, the statistics
associated with each model can be displayed along with the list of compounds used in developing
the model. For the regression-based modules, assessments of accuracy based on the 95%
confidence interval and estimated precision of the experimental values are provided along with the
estimated property value. Additional correlation models can be easily added to PEP by the user.

A chemical property database, containing experimental values of S, Kow, H, Pv, Koc, and
BCF compiled from a variety of literature sources and computerized databases, was used for
developing the MCI-property, TSA-property and property-property relationships used in PEP.
This database, containing over 800 chemicals, is linked directly to PEP and allows the user to
search for chemical compounds by full or partial name or synonym, to sort the
compounds by name, boiling point, melting point, or molecular weight, and to transfer
to any of the property estimation modules.

In  addition  to  the  physical  properties,  the  database  was  recently  modified  to  allow  the  user  to 
enter  information  pertaining  to  a  compound’s  persistence  and  toxicity.  Biodegradation  rates, 
hydrolysis rates, photolysis rates and LC50 values, along with the references and comments
associated  with  each  property,  can  be  stored  in  the  database. 
To illustrate the potential application of PEP, the property estimation modules are linked
directly to the Level 1 and 2 Fugacity Models developed by Mackay [32]. These simple models
calculate  the  equilibrium  distribution  of  an  organic  chemical  between  water,  air,  soil,  sediment, 
suspended  sediment  and  biota  phases  in  a  user  defined  world.  The  combination  of  PEP  and 
Fugacity  models  provides  users  with  a  methodology  for  predicting  the  environmental  distribution 
of  an  organic  chemical  in  a  multi-phase  system  requiring  only  the  structure  of  the  chemical  of 
interest  as  input.  The  development  and  use  of  the  PEP  system  will  be  described. 
STATUS OF RESEARCH EFFORT


General  Programming  Description 

HyperCard 

HyperCard  is  a  program  that  was  developed  for  the  Apple  Macintosh  series  of  personal 
computers  to  enable  novice  Macintosh  programmers  to  write  user  friendly  computer  applications. 
HyperCard,  which  is  provided  with  every  Macintosh  sold,  offers  graphics,  information  storage, 
and  the  means  to  display  information  in  a  variety  of  formats.  HyperTalk  is  a  high-level, 
interpreted  language  used  to  establish  links  between  related  information  and  perform  simple 
calculations  within  HyperCard.  HyperCard  also  allows  the  programmer  to  create  extensions  of 
HyperTalk  in  a  lower  level  language  (i.e.,  C  or  Fortran).  These  extensions,  called  external 
functions  (XFCN)  and  external  commands  (XCMD),  greatly  increase  the  speed  of  repetitive  and 
calculation  intensive  algorithms  over  using  HyperTalk  itself.  XFCNs  and  XCMDs  can  also  be 
used  to  implement  custom  Macintosh  features  such  as  popup  menus  and  custom  dialog  boxes. 

Cards 

Each  screen  of  information  in  HyperCard  is  termed  a  card.  Each  card  can  contain  graphics,  fields, 
and  buttons.  The  data  on  a  card  is  held  in  the  fields,  and  the  buttons  are  used  to  initiate  action 
procedures  that  operate  on  the  data.  The  fields  and  buttons  allow  the  standard  Macintosh  interface 
to  be  used  without  the  direct  use  of  the  cumbersome  Macintosh  toolbox  routines.  To  create  a  user 
interface, the HyperCard programmer simply draws, or creates, the buttons and fields. The link
between buttons, fields and cards is made through HyperTalk scripts. A script is a set of
HyperTalk  statements  linked  to  a  button,  field,  card,  or  stack. 
Stacks


Cards are put together in HyperCard files called stacks. A stack can contain from 1 to 16,000
cards  depending  on  the  amount  of  memory  each  card  requires.  Usually  each  stack  contains  cards 
that  are  related  either  by  purpose  or  visual  similarity.  The  movement  from  stack  to  stack  is  rapid 
and  easy  to  accomplish  using  either  the  standard  Macintosh  menus  or  HyperTalk  scripts. 

External  functions  and  commands 

Some  of  the  custom  features  used  to  enhance  PEP  were  implemented  using  commercial 
XFCNs  and  XCMDs.  Table  1  lists  the  commercial  XFCNs  and  XCMDs  that  were  used,  their 
creator,  and  action. 


Table 1. The commercial XFCNs and XCMDs used

XFCN or XCMD    Creator                      Use
popUp           Adrian Freed (1989a) [42]    makes a pop-up menu
ShowDialog      Jay Hodgdon (1988a) [40]     shows a modal dialog
Progress        Jay Hodgdon (1988b) [41]     shows a dialog box with a progress pointer

The  XFCNs  and  XCMDs  that  were  used  in  PEP  were  created  using  Think  C  versions  4.0  and 
5.0 from Symantec Corporation (1991) [44]. “Glue” routines are used to facilitate the
communication  between  HyperCard  and  XFCNs  or  XCMDs.  HyperCard  glue  is  furnished  with 
Think  C.  XTRA  Shell  by  Adrian  Freed  (1989b)  [43]  was  also  used  to  develop  XFCNs  and 
XCMDs.  XTRA  also  contains  HyperCard  glue  plus  a  simple  to  use  set  of  functions  that  can  be 
called  from  Think  C. 
Algorithms


TSA  algorithm 

Total  Surface  Areas  (TSA)  are  calculated  using  a  modified  version  of  the  SALV02  algorithm 
developed by Pearlman [16]. SALV02, a FORTRAN program designed to run on mainframe
computers, was translated to the C computer language using Cobalt Blue’s [46] FOR_C translator
version  2.9  (1989).  This  translation  enabled  the  SALV02  algorithm  to  be  made  into  an  XFCN  and 
linked  directly  to  a  HyperCard  stack. 

MCI  calculation  method 

A C language program was written for the calculation of the MCIs based on code described by
Frazier [35]. The algorithm currently calculates 54 (0 to 6 order) bond, valence, and path indices,
and 7 (0 through 6 order) Δ valence indices if the molecule contains any nitrogen or oxygen atoms.
A  more  detailed  discussion  of  the  MCI  calculation  procedure  is  provided  in  the  literature  review 
section. 

UNIFAC calculation method

The UNIFAC procedure, as described by Grain [38], was incorporated into HyperTalk scripts
and  XFCNs.  The  group  contribution  factors  were  also  taken  from  Grain  [38]  and  are  derived  from 
vapor-liquid  equilibria  data. 

Fugacity  level  1  model 

The  Fugacity  level  1  model,  described  by  Mackay  [32,  39],  was  implemented  in  HyperTalk. 
This  model  is  used  to  estimate  the  distribution  of  a  chemical  in  a  user  defined  environment 
consisting  of  a  maximum  of  six  compartments:  air,  water,  soil,  sediment,  suspended  solids,  and 
biota.  The  default  compartment  volumes  and  densities  were  also  taken  from  Mackay  [32]. 
The graphs which show the distribution of the chemical are drawn using routines from
GraphMaker, a HyperCard stack included with version 2.0 of HyperCard. A detailed description
of  the  PEP  implementation  of  the  Fugacity  level  1  model  is  provided  later. 
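The Level 1 calculation itself is short and can be sketched as follows. This is a generic illustration of an equilibrium fugacity calculation, not the HyperTalk implementation in PEP; the function names, the compartment volumes, the Henry's Law constant and the sediment partition coefficient are hypothetical values chosen for the example. Fugacity capacities (Z) are assumed to be in mol/(m3 Pa) and volumes in m3.

```python
R = 8.314  # gas constant, Pa m3 / (mol K)

def z_air(temperature_k):
    """Fugacity capacity of air, 1 / (R T)."""
    return 1.0 / (R * temperature_k)

def z_water(henry_pa_m3_per_mol):
    """Fugacity capacity of water, 1 / H, with H in Pa m3 / mol."""
    return 1.0 / henry_pa_m3_per_mol

def z_from_partition(k_phase_water, z_w):
    """Capacity of another phase from a dimensionless phase/water
    partition coefficient (volume basis): Z_phase = K * Z_water."""
    return k_phase_water * z_w

def level1(total_moles, volumes, z_values):
    """Equilibrium distribution: one common fugacity for all compartments.

    f = M / sum(V_i * Z_i); amount in compartment i = f * V_i * Z_i.
    """
    capacity = sum(volumes[c] * z_values[c] for c in volumes)
    fugacity = total_moles / capacity
    moles = {c: fugacity * volumes[c] * z_values[c] for c in volumes}
    return fugacity, moles

# Hypothetical three-compartment world and chemical.
volumes = {"air": 1.0e6, "water": 1.0e3, "sediment": 10.0}
zw = z_water(50.0)
z_values = {
    "air": z_air(298.15),
    "water": zw,
    "sediment": z_from_partition(200.0, zw),
}
fugacity, moles = level1(100.0, volumes, z_values)
```

Because every compartment shares one fugacity, the concentration ratio between any two compartments equals the ratio of their Z values, which for air and water reduces to H/RT.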

Development  of  QSPRs  and  QPPRs 

The  QSPRs  and  QPPRs  utilized  in  PEP  were  developed  using  both  statistical  and  intuitive 
criteria. The QSPRs were first derived using the stepwise regression features in StatView II, a
statistical  analysis  package  by  Abacus  Concepts  Inc.  (1988)  [47].  The  results  from  the  stepwise 
regression  procedure  were  analyzed  and  the  variables  containing  theoretical  information  were  left  in 
the  regression  equation.  The  final  regression  equation  was  chosen  to  include  both  a  size  term  and  a 
measure  of  the  polar  nature. 

After the regression equations were chosen, the final calculations of the Analysis of Variance
table  and  the  graphs  were  obtained  using  Data  Desk  by  Odesta  Corporation  (1989)  [48].  Both 
universal  and  class  specific  equations  for  each  property  were  developed  and  evaluated.  All  of  the 
universal  relationships  and  the  class  specific  relationships  that  were  found  to  be  significant  to  the 
90  percent  level  were  incorporated  in  PEP. 

PEP  Hardware/software  Requirements 

PEP  requires  the  following  system  configuration  to  run:  a  Macintosh  Classic,  LC,  II  series,  or 
PowerBook  computer,  with  a  hard  disk;  HyperCard  2.0  or  greater  software;  Macintosh  system 
software  version  6.0.5  or  greater,  running  under  MultiFinder,  and  a  minimum  of  2  megabytes  of 
memory  (RAM),  with  1000  kBytes  of  memory  allocated  for  HyperCard. 

PEP  overview 

The  PEP  system  currently  consists  of  four  HyperCard  stacks:  PEP  Processor,  PEP  Models, 
PEP  Help  and  Chemical  Property  Database.  A  flowchart  illustrating  the  overall  operation  of  PEP 
is  provided  in  Figure  1. 
Figure 1. Flow chart illustrating the overall operation of PEP


Typically,  users  would  first  look  for  the  required  property  information  in  the  PEP  database.  If 
the  information  is  not  contained  in  the  database,  the  user  can  then  estimate  the  property  using  one 
or  more  of  the  four  property  estimation  modules  provided.  Choosing  the  most  appropriate 
property  estimation  module  would  depend  on  what  information  regarding  the  chemical  is  available. 
The  function  and  use  of  each  stack  in  PEP  will  be  described  in  the  following  sections. 

PEP  Processor 

This  stack,  divided  into  four  sections  or  modules,  contains  the  algorithms  for  data  input, 
calculations and output of the estimated physical-chemical properties. Each module will be described
in  detail  in  the  following  sections. 


The  overall  operation  of  the  MCI  module  is  illustrated  in  Figure  2. 


Figure 2. Flow chart describing operation of PEP MCI module.


The  user  interface  of  this  module,  shown  in  Figure  3,  is  designed  in  the  form  of  a  flow  chart. 
Upon  entering  the  MCI  module  the  user  must  first  input  the  necessary  structural  information  using 
either  SMILES  [33,34]  notation  or  connection  files  generated  from  ChemDraw™,  Chemintosh™, 
or  ISIS/Draw™,  commercially  available,  Macintosh  compatible  two-dimensional  (2D)  drawing 
programs. 

SMILES  is  a  chemical  notation  language  specifically  designed  for  computer  use.  It  is  a  method 
of  "unfolding"  a  2D  chemical  structure  into  a  single  line  of  characters  containing  the  structural 
information. 
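To show how a single line of characters can carry the connectivity of a structure, the sketch below reads a small subset of SMILES into an atom list and a bond list. It is a deliberately simplified, hypothetical reader written for this example: it handles only unbracketed atom symbols (including Cl and Br and lowercase aromatic atoms), parentheses for branches and single-digit ring closures, and it ignores bond order. It is not the parser used in PEP.

```python
def parse_smiles(smiles):
    """Return (atoms, bonds) for a restricted subset of SMILES.

    atoms: list of element symbols in order of appearance.
    bonds: list of (i, j) index pairs; bond order is not recorded.
    """
    atoms, bonds = [], []
    prev = None         # index of the atom the next atom attaches to
    branch_stack = []   # attachment points saved at each "("
    ring_open = {}      # ring-closure digit -> index of the opening atom
    i = 0
    while i < len(smiles):
        ch = smiles[i]
        two = smiles[i:i + 2]
        if two in ("Cl", "Br"):
            symbol, i = two, i + 2
        elif ch.isalpha():
            symbol, i = ch, i + 1
        elif ch == "(":
            branch_stack.append(prev)
            i += 1
            continue
        elif ch == ")":
            prev = branch_stack.pop()
            i += 1
            continue
        elif ch.isdigit():
            if ch in ring_open:
                bonds.append((ring_open.pop(ch), prev))
            else:
                ring_open[ch] = prev
            i += 1
            continue
        elif ch in "-=#":
            i += 1      # bond symbols are skipped in this sketch
            continue
        else:
            raise ValueError(f"unsupported SMILES character: {ch!r}")
        atoms.append(symbol)
        if prev is not None:
            bonds.append((prev, len(atoms) - 1))
        prev = len(atoms) - 1
    return atoms, bonds

# 2-propanol, benzene and biphenyl written as SMILES strings.
isopropanol = parse_smiles("CC(C)O")
benzene = parse_smiles("c1ccccc1")
biphenyl = parse_smiles("c1ccccc1-c2ccccc2")
```

The atom and bond lists produced this way are the same kind of connection information that a drawing program stores in a connection file.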
Figure 3. Screen display of PEP MCI module


After the structural information is entered, MCIs can be calculated using a set of
HyperCard™ external functions (XFCNs) written in the C programming language and based on code
described by Frazier [35]. The MCI calculation routine in PEP calculates simple, bond and valence
indices of several types (path, cluster, chain, and path/cluster) and orders (0 through 6), if
possible, for each molecule, resulting in a maximum of 54 index values for each molecule, which
can be displayed on screen and/or output to a printer. To account for non-dispersive force effects
on aqueous solubility and solubility-related properties, zero- through sixth-order Δ valence path
indices (Δχ), as described by Bahnick and Doucette [36], are calculated by PEP in addition to the
54 indices described above. To calculate Δχ indices, a nonpolar equivalent of the molecule is made
by substituting C for O or N atoms. MCIs are calculated for the nonpolar equivalent, and values of
Δχ can be computed for each type of index by:

$$\Delta\chi = (\chi)_{np} - \chi$$

where (χ)np is the index calculated for the nonpolar equivalent and χ is the same index
calculated for the original molecule.
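The difference between the index of the nonpolar equivalent and that of the original molecule can be illustrated for the first-order valence index. This sketch rests on several assumptions that should be checked against Bahnick and Doucette [36]: the skeleton contains only single bonds, the valence delta of an atom is its number of valence electrons minus its attached hydrogens, and the carbon substituted for O or N is given enough hydrogens to complete four bonds. The function names are hypothetical.

```python
VALENCE_ELECTRONS = {"C": 4, "N": 5, "O": 6}

def chi1_valence(atoms, bonds):
    """First-order valence index for a hydrogen-suppressed skeleton.

    atoms: list of (symbol, attached_hydrogens); bonds: list of (i, j).
    """
    delta_v = [VALENCE_ELECTRONS[s] - h for s, h in atoms]
    return sum((delta_v[i] * delta_v[j]) ** -0.5 for i, j in bonds)

def nonpolar_equivalent(atoms, bonds):
    """Replace each O or N by C, filling hydrogens to four bonds
    (single-bonded skeletons only)."""
    degree = [0] * len(atoms)
    for i, j in bonds:
        degree[i] += 1
        degree[j] += 1
    return [
        ("C", 4 - degree[k]) if symbol in ("O", "N") else (symbol, h)
        for k, (symbol, h) in enumerate(atoms)
    ]

def delta_chi1(atoms, bonds):
    """Index of the nonpolar equivalent minus index of the molecule."""
    nonpolar = nonpolar_equivalent(atoms, bonds)
    return chi1_valence(nonpolar, bonds) - chi1_valence(atoms, bonds)

# Ethanol, CH3-CH2-OH; its nonpolar equivalent under these rules is propane.
ethanol_atoms = [("C", 3), ("C", 2), ("O", 1)]
ethanol_bonds = [(0, 1), (1, 2)]

chi_ethanol = chi1_valence(ethanol_atoms, ethanol_bonds)  # about 1.023
delta_ethanol = delta_chi1(ethanol_atoms, ethanol_bonds)  # about 0.391
```

A hydrocarbon has no O or N atoms to replace, so its difference is zero; the value grows as heteroatoms account for more of the index.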

After  the  MCIs  are  calculated,  they  can  be  displayed  or  printed  if  desired  and  the  user  can  then 
choose which properties are to be estimated. For each property, two categories of MCI-property
relationships are displayed. MCI-property relationships, both class specific and “universal”, that
were  developed  in  this  project  using  the  experimental  values  reported  in  the  PEP  property  database 
are  preceded  with  the  word  PEP.  “Universal”  MCI-property  relationships  were  developed  using 
all  available  experimental  data  for  a  given  property  regardless  of  chemical  class.  “Class-specific” 
MCI-property  relationships  were  developed  if  property  values  were  available  for  a  sufficient 
number  (10  or  greater)  of  compounds  within  a  particular  chemical  class  (PCBs,  PAHs,  ureas, 
etc.). In addition, several multi-class MCI-property correlations were developed for broader
classes of compounds, such as halogenated aliphatics and halogenated aromatics. An example
illustrating the potential hierarchy of MCI-property relationships available to the user for
predicting the vapor pressure (Pv) of a polychlorinated biphenyl (PCB) is shown below. There are
three  MCI-property  relationships,  one  developed  using  only  PCBs,  one  using  halogenated 
aromatics  including  PCBs  and  one  using  all  compound  types: 

log Pv = 5.814 (nc5) - 2.428 (np3) + 9.479    (PCBs)

log Pv = -1.559 (bp1) + 6.622    (Halogenated aromatics)

log Pv = -1.275 (np3) + 5.261    (Universal)

Generally,  the  use  of  a  “class-specific”  relationship,  if  available,  should  provide  the  best 
estimate  (i.e.  the  estimate  associated  with  the  least  amount  of  uncertainty). 
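The hierarchy can be sketched as a lookup that tries the most specific applicable relationship first. The three equations are those given above, with the index names nc5, np3 and bp1 used as printed; the selection function, the class labels passed to it and the index values in the example are hypothetical and serve only to show the order of preference.

```python
# Models ordered from most class-specific to most general.
MODELS = [
    ("PCBs",
     lambda ix: 5.814 * ix["nc5"] - 2.428 * ix["np3"] + 9.479),
    ("Halogenated aromatics",
     lambda ix: -1.559 * ix["bp1"] + 6.622),
    ("Universal",
     lambda ix: -1.275 * ix["np3"] + 5.261),
]

def estimate_log_pv(indices, classes):
    """Return (model name, log Pv) from the first applicable model.

    indices: mapping of index name -> value for the compound.
    classes: set of class names the compound belongs to.
    The universal model applies to every compound.
    """
    for name, model in MODELS:
        if name == "Universal" or name in classes:
            return name, model(indices)

# Made-up index values for illustration (not computed from a real PCB).
example_indices = {"nc5": 0.1, "np3": 5.0, "bp1": 6.0}

as_pcb = estimate_log_pv(example_indices, {"PCBs", "Halogenated aromatics"})
as_haloaromatic = estimate_log_pv(example_indices, {"Halogenated aromatics"})
as_other = estimate_log_pv(example_indices, set())
```

Ordering the list from narrowest to broadest class means a compound falls back to a more general relationship only when no narrower one applies.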

By  looking  for  a  group  of  atoms  and  bonds  that  distinguish  a  chemical  class,  PEP  uses  the 
structural  information  contained  in  the  SMILES  string  or  connection  file  input  to  aid  users  in 
choosing  the  most  appropriate  MCI-property  relationships.  The  number  of  appropriate 
relationships  or  chemical  classes  that  are  chosen  by  the  program,  denoted  with  a  diamond  in  the 
popup  menu,  is  determined  by  the  number  of  different  distinguishing  subgroups  that  are  found.  In 
addition, a summary of the regression statistics and a list of the compounds used to develop and
evaluate each MCI-property relationship can be displayed by clicking the “eye” or “view statistics”
option found at the left of each regression model. Information displayed on the statistics card
includes: the MCI-property regression equation, the list of chemicals used in developing the
regression model, the standard errors of the coefficients in the regression equation, the Analysis of
Variance (ANOVA) table, the r² value, a graph of the predicted vs. estimated values, a graph of the
residuals vs. the predicted values, a graph of the residuals vs. the number of standard deviations,
and the appropriate reference. An example of the statistical information provided for each
MCI-property relationship is shown in Figure 4.

Figure 4. Example statistics card from PEP


The second category of MCI-property correlations, located below the PEP relationships, was
compiled from various literature sources. Clicking on the "book" icon will display the reference
and, if it was available in the original literature, information regarding the number and type of
compounds included in the correlation.

After choosing the most appropriate regression, estimates for the selected properties can be
made. As shown in Figure 5, the MCI module results card provides an estimate of the property
along with its calculated accuracy, based on both the 95% confidence interval calculated from the
regression and the estimated precision associated with the experimental determination of the
property. In addition, the user has the option to search the property database for actual
experimental values if they are available for comparison.

*Note: The values shown are estimated at 25°C ± the

Figure 5. Results card from PEP MCI module.


TSA module. The TSA module is similar in operation to the MCI module. However, unlike
molecular connectivity, the calculation of TSA requires information describing the geometry of the
molecule in terms of its 3-D atomic coordinates. The TSA module, shown in Figure 6, accepts
3-D atomic coordinates entered manually or directly reads coordinate files generated by commercially
available, Macintosh-compatible molecular modeling software such as Alchemy III™ or Chem3D
Plus™.


Figure 6. TSA module card from PEP


The  TSA  module  is  also  designed  to  accept  files  generated  by  other  hardware/software 
combinations  including  UNIX  or  VAX  versions  of  CONCORD  (Tripos  Associates,  Inc.),  a  hybrid 
expert  system  and  molecular  modeling  software  designed  for  the  rapid  generation  of  high  quality 
approximate 3-D molecular structures. In addition to the 3-D molecular structure, the user must
also input van der Waals radii for each of the atoms. An editable table of van der Waals radii,
obtained from Bondi [37] for most common atoms, is provided within the TSA module. Once the
molecular geometry and the van der Waals radii are input, TSA can be calculated using an XFCN
that was adapted from the SALV02 algorithm developed by Pearlman [19]. This algorithm
represents each atom of a molecule by a sphere centered at the equilibrium position of the nucleus.
The radius of each sphere is equal to the van der Waals radius of the atom. Planes of intersection
between spheres are used to estimate the contribution to surface area from the individual atoms or
groups. The program computes the surface area of individual atoms or groups by numerical
integration, and the overlap due to intersecting spheres is excluded from the calculation. TSA is
calculated by summing the individual group contributions. The program also allows the TSA
of the solute molecule to be calculated after the addition of a suitable solvent radius. A more
detailed description of the TSA calculation method is provided by Pearlman [19].
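
The idea of summing the exposed area of overlapping atomic spheres can be illustrated with a short Python sketch. Note that this is not the Pearlman algorithm used in PEP: instead of planes of intersection, it uses a simpler point-sampling scheme (test points spread over each sphere, with buried points discarded). The function names, the number of test points and the example coordinates are assumptions made for illustration.

```python
import math


def sphere_points(n):
    """Return n roughly evenly spaced points on a unit sphere (golden spiral)."""
    golden_angle = math.pi * (3.0 - math.sqrt(5.0))
    points = []
    for i in range(n):
        z = 1.0 - (2.0 * i + 1.0) / n
        r = math.sqrt(max(0.0, 1.0 - z * z))
        theta = golden_angle * i
        points.append((r * math.cos(theta), r * math.sin(theta), z))
    return points


def total_surface_area(atoms, probe=0.0, n_points=2000):
    """Estimate the total surface area by point sampling.

    atoms: list of (x, y, z, van_der_waals_radius), all in the same length unit.
    probe: optional solvent radius added to every atomic radius.
    Returns (total area, list of per-atom exposed areas).
    """
    unit = sphere_points(n_points)
    areas = []
    for i, (xi, yi, zi, ri) in enumerate(atoms):
        radius_i = ri + probe
        exposed = 0
        for ux, uy, uz in unit:
            px = xi + radius_i * ux
            py = yi + radius_i * uy
            pz = zi + radius_i * uz
            buried = False
            for j, (xj, yj, zj, rj) in enumerate(atoms):
                if j == i:
                    continue
                dist2 = (px - xj) ** 2 + (py - yj) ** 2 + (pz - zj) ** 2
                if dist2 < (rj + probe) ** 2:
                    buried = True  # point lies inside a neighboring sphere
                    break
            if not buried:
                exposed += 1
        areas.append(4.0 * math.pi * radius_i ** 2 * exposed / n_points)
    return sum(areas), areas


# Two overlapping spheres of radius 1 whose centers are 1 unit apart (hypothetical).
tsa, per_atom = total_surface_area([(0.0, 0.0, 0.0, 1.0), (0.0, 0.0, 1.0, 1.0)])
```

For an isolated atom the result reduces to the area of a sphere, and the per-atom list shows how each atom's contribution shrinks as neighbors overlap it. A production implementation would use an analytical or adaptive scheme rather than a fixed number of test points.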

After  the  TSA  has  been  calculated,  the  user  then  chooses  the  properties  of  interest  and  a 
regression  equation  for  each  using  the  same  approach  as  described  in  the  MCI  module.  If  the 
SMILES  string  or  the  connection  table  is  also  input,  the  most  appropriate  TSA-property 
relationship(s)  will  be  flagged  in  the  popup  menu.  The  operation  of  the  TSA  module  from  this 
point  on  is  identical  to  that  of  the  MCI  module. 

UNIFAC module. Like the MCI module, the UNIFAC module, illustrated in Figure 7, requires
either a SMILES string or a connection table as input. An XFCN converts the structural
information provided by the SMILES string or connection file into valid UNIFAC subgroups and
counts the number of each subgroup present. In order to break the structure into the proper
subgroups, the SMILES string or the connection file is interpreted and the information is put into a
matrix. Each row and column in the matrix represents an atom in the chemical. Each entry in the
matrix contains the bond order between the two atoms that correspond to its row and column. If two
atoms are not connected, a 0 is placed in the corresponding entry. After the
matrix is built, the algorithm "asks" specific questions about each atom, its neighbors, and
how it is connected. If the answers to a set of questions are all true, a subgroup has been found,
the atoms are put together, and the matrix is reduced. The questions are then asked again and
the next subgroup is chosen; this repeats until no more subgroups are found. The questions are
asked in a specific sequence so that the resulting subgroups are independent of the order of the
atoms in the matrix.
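
The bond-order matrix and the idea of "asking questions" about an atom can be sketched as follows. This is a minimal Python illustration, not the XFCN used in PEP: the function names and the single example question (recognizing the carbon of a carboxylic acid group) are hypothetical, and hydrogens are left implicit.

```python
def bond_order_matrix(n_atoms, bonds):
    """Build a symmetric matrix of bond orders; 0 means 'not connected'.

    bonds: iterable of (i, j, order) with 0-based atom indices.
    """
    matrix = [[0] * n_atoms for _ in range(n_atoms)]
    for i, j, order in bonds:
        matrix[i][j] = order
        matrix[j][i] = order
    return matrix


def neighbors(matrix, i):
    """Indices of the atoms bonded to atom i."""
    return [j for j, order in enumerate(matrix[i]) if order > 0]


def is_carboxyl_carbon(atoms, matrix, i):
    """Example question set: is atom i a carbon bonded to exactly one
    double-bonded oxygen and one single-bonded oxygen?"""
    if atoms[i] != "C":
        return False
    oxygen_orders = sorted(
        matrix[i][j] for j in neighbors(matrix, i) if atoms[j] == "O"
    )
    return oxygen_orders == [1, 2]


# Heavy atoms of acetic acid, SMILES CC(=O)O: C0-C1, C1=O2, C1-O3.
atoms = ["C", "C", "O", "O"]
matrix = bond_order_matrix(4, [(0, 1, 1), (1, 2, 2), (1, 3, 1)])
carboxyl_atoms = [i for i in range(len(atoms)) if is_carboxyl_carbon(atoms, matrix, i)]
```

Because the question depends only on the element types and bond orders around an atom, the answer does not change if the atoms are listed in a different order, which is the property the text describes.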

The UNIFAC method for calculating activity coefficients, as described by Grain [38], is
implemented using both HyperTalk and an XFCN. The functional group interaction parameters,
presented by Gmehling et al. [26] and derived from vapor-liquid equilibria (VLE), are used in the
calculation routine but can be changed by the user. After the activity coefficients are calculated, they
can be displayed along with relevant intermediate values and used to estimate S and Kow by the
following expressions (Arbuckle, 1986):

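$$K_{ow} = 0.115\,\frac{\gamma_w^{\infty}}{\gamma_o^{\infty}} \qquad (2)$$

$$S\ (\mathrm{mol/L}) = \frac{55.6}{\gamma_w^{\infty}} \qquad (3)$$

where $\gamma_w^{\infty}$ is the activity coefficient of the chemical infinitely dilute in water and
$\gamma_o^{\infty}$ is the activity coefficient of the chemical infinitely dilute in octanol [27].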
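
As a small worked illustration of expressions (2) and (3), the Python sketch below converts a pair of infinite-dilution activity coefficients into Kow and S. The function names and the numerical activity coefficients are hypothetical; only the constants 0.115 and 55.6 come from the expressions above.

```python
import math


def kow_from_activity(gamma_w_inf, gamma_o_inf):
    """Expression (2): Kow from infinite-dilution activity coefficients."""
    return 0.115 * gamma_w_inf / gamma_o_inf


def solubility_mol_per_l(gamma_w_inf):
    """Expression (3): aqueous solubility in mol/L."""
    return 55.6 / gamma_w_inf


# Hypothetical activity coefficients, for illustration only.
gamma_w = 2.0e4   # infinitely dilute in water
gamma_o = 2.3     # infinitely dilute in octanol

kow = kow_from_activity(gamma_w, gamma_o)
log_kow = math.log10(kow)
solubility = solubility_mol_per_l(gamma_w)
```

With these assumed inputs, Kow is about 1000 (log Kow of about 3) and S is about 2.8e-3 mol/L, which shows how a large activity coefficient in water corresponds to both a high Kow and a low solubility.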

Property/Property Module. Input for the Property/Property module, shown in Figure 8,
depends on the properties to be estimated and the regression models used. Thus, the user must
select the properties to be estimated and the property-property relationships (regression equations)
to be used before any input values are requested. The program keeps track of the inputs required
and provides the appropriate input fields. If available, the required properties can be imported
directly from the associated chemical property database. Information regarding the regression
statistics, if available, is also provided as previously described for the MCI module. After the
necessary properties are entered into the corresponding input fields, the properties of interest can
be estimated and the results, along with the 95% prediction interval (if the necessary data are
available), can be viewed.

Figure 8. PEP Property/Property correlation module.

PEP Models


To illustrate the practical application of PEP, an additional stack called PEP Models was
developed. This stack, which contains the algorithms for the Level 1 and 2 Fugacity Models [32],
is linked directly to the PEP Processor but can also be used independently.

The Level 1 Fugacity Model considers a unit world consisting of six compartments: air, water,
soil, suspended solids, sediment, and biota, as illustrated below in Figure 9. The model predicts
the equilibrium concentrations of the chemical of interest in each compartment using the fugacity
approach described by Mackay [32, 39]. The model requires the input of Koc, H and BCF, which
can be read directly from the PEP Processor or the PEP chemical property database, if available.
In addition to the chemical-specific properties, the density and volume of each compartment must
be specified along with the organic carbon content of the soil, sediment and suspended solids.
An editable set of default values for compartment density, volume and organic carbon content, as
suggested by Mackay, is provided.
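
The Level 1 calculation can be sketched in Python as follows. The report does not list the equations, so the fugacity capacity (Z) definitions used here are the commonly cited textbook forms and should be read as an assumption, as should the units (H in Pa m3/mol, Koc and BCF in L/kg), the compartment volumes, the value of H and the total amount of chemical. Only log Koc = 3.92 and log BCF = 4.69 are taken from the example card in Figure 10.

```python
R_GAS = 8.314  # gas constant, Pa m3 / (mol K)


def z_value(kind, H, koc, bcf, density_kg_m3, foc, temp_k):
    """Fugacity capacity Z in mol/(m3 Pa) for one compartment (assumed forms)."""
    rho_kg_per_l = density_kg_m3 / 1000.0
    if kind == "air":
        return 1.0 / (R_GAS * temp_k)
    if kind == "water":
        return 1.0 / H
    if kind == "solid":  # soil, sediment, suspended solids
        return koc * foc * rho_kg_per_l / H
    if kind == "biota":
        return bcf * rho_kg_per_l / H
    raise ValueError("unknown compartment kind: " + kind)


def level1(total_mol, H, koc, bcf, compartments, temp_k=298.15):
    """Equilibrium distribution of total_mol among the compartments.

    Returns (fugacity in Pa, concentrations in mol/m3, amounts in mol).
    """
    z = {
        name: z_value(c["kind"], H, koc, bcf, c["density"], c.get("foc", 0.0), temp_k)
        for name, c in compartments.items()
    }
    capacity = sum(c["volume"] * z[name] for name, c in compartments.items())
    fugacity = total_mol / capacity
    conc = {name: z[name] * fugacity for name in compartments}
    moles = {name: conc[name] * compartments[name]["volume"] for name in compartments}
    return fugacity, conc, moles


# Hypothetical unit world: density in kg/m3, volume in m3, foc as a fraction.
UNIT_WORLD = {
    "air": {"kind": "air", "density": 1.2, "volume": 6.0e9},
    "water": {"kind": "water", "density": 1000.0, "volume": 7.0e6},
    "soil": {"kind": "solid", "density": 1500.0, "volume": 4.5e4, "foc": 0.02},
    "suspended solids": {"kind": "solid", "density": 1500.0, "volume": 35.0, "foc": 0.04},
    "sediment": {"kind": "solid", "density": 1500.0, "volume": 2.1e4, "foc": 0.04},
    "biota": {"kind": "biota", "density": 1000.0, "volume": 7.0},
}

fugacity, conc, moles = level1(
    100.0, H=10.0, koc=10 ** 3.92, bcf=10 ** 4.69, compartments=UNIT_WORLD
)
percent = {name: 100.0 * amount / 100.0 for name, amount in moles.items()}
```

Because every compartment shares the same fugacity at equilibrium, the amounts always add up to the total that was specified, and the percentages correspond to the pie-chart view described for the results card.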


Figure  9.  Representation  of  Fugacity  Level  1  Model  compartments. 


The Level 2 Fugacity Model (Figure 10) allows for the chemical of interest to degrade in each
compartment, move by advection through the water and air phases, and be emitted into the unit
world. The rate values for each of these processes must be entered by the user. The degradation
rates for each compartment can be entered either as half-life (t1/2) values in hours or as first-order
reaction rate constants in reciprocal hours (1/h). The advection rate data can be entered either as a
residence time or flow rate together with the concentration, or by directly entering the mass flow
rate in moles per hour. The emission rate is entered in units of moles per hour.
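
The relationships between these alternative inputs can be illustrated with a short Python sketch. The half-life conversion is the standard first-order relation; the steady-state mass balance is the usual textbook form of a Level 2 calculation and is an assumption here, since the report does not list the equations. All names and numbers are hypothetical.

```python
import math


def rate_constant(half_life_h):
    """First-order rate constant (1/h) from a half-life in hours."""
    return math.log(2.0) / half_life_h


def advective_flow(volume_m3, residence_time_h):
    """Flow rate G (m3/h) through a compartment from its residence time."""
    return volume_m3 / residence_time_h


def level2_fugacity(emission_mol_h, compartments):
    """Steady-state fugacity (Pa) from a mass balance:
    emission + advective inflow = reaction losses + advective outflow.

    compartments: list of dicts with V (m3), Z (mol/(m3 Pa)), k (1/h),
    G (m3/h) and c_in (mol/m3, concentration in the inflow).
    """
    inputs = emission_mol_h + sum(c["G"] * c["c_in"] for c in compartments)
    d_total = sum(c["V"] * c["Z"] * c["k"] + c["G"] * c["Z"] for c in compartments)
    return inputs / d_total


# Hypothetical single-compartment example.
k_water = rate_constant(100.0)            # half-life of 100 h
g_water = advective_flow(1000.0, 500.0)   # residence time of 500 h
f = level2_fugacity(
    5.0, [{"V": 1000.0, "Z": 0.1, "k": k_water, "G": g_water, "c_in": 0.0}]
)
```

Either way of entering a degradation rate leads to the same rate constant, and either way of entering advection leads to the same flow rate, so the model can convert whatever the user supplies into a single set of loss terms.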


[Figure 10 shows the Fugacity Model card for 2,2',6,6'-tetrachlorobiphenyl. Step 1, “Input
Property Values”: log Koc = 3.92, log H = -1.640, log BCF = 4.69, with an option to look for values
in the database. Step 2, “Input Environmental Compartment Values”: density (kg/m3), volume (m3)
and % organic carbon for the air, water, soil, suspended solids, sediment and biota compartments.
Step 3, “Calculate Distribution”, or “Input Fugacity Level 2 Data”.]

Figure  10.  PEP  Models  card 


After the user inputs all the required information and clicks the “calculate distribution” button, the
model calculations are performed in HyperTalk and the results are presented in both tabular and
graphical form, as illustrated in Figure 11. The graphical display can be changed from bar
(concentration of the chemical in each phase) to pie (percent of the chemical in each phase) chart
form using the “Graph:” popup menu. The values of the distribution coefficients that were used
in the calculations are also shown on the results card. A complete description of these models has
been given by Mackay [32, 39].

Figure 11. PEP Models results card

PEP  Help 

Information regarding the operation of the chemical property database and the property
estimation, models and batch modules is available in the PEP Help stack. This stack is easily
accessed at any time within the PEP system. The organization and layout of each help card is
similar to that illustrated in Figure 12 for the MCI module. The user can select the topic of interest
by clicking on the appropriate radio button, and the information on that subject will be displayed in
the scrolling field.

structure of the compound, can be derived. For a
given molecular structure, several types and
orders of molecular connectivity indexes (MCIs)
can be calculated. Information on the molecular
size, branching, cyclization, unsaturation, and
heteroatom content of a molecule is encoded in
these various indices (Kier and Hall, 1976).
Molecular connectivity has been used to predict
Koc (Sabljic, 1984, Sabljic, 1987, Bahnick and
Doucette, 1988), S (Doucette, 1985,
Nirmalakhandan and Speece, 1988a), Kow (Doucette


Figure  12.  Example  Help  card  from  PEP 


Chemical  Property  Database 

Experimentally determined physical property data for about 800 compounds, each having at least
one value of aqueous solubility (S), octanol/water partition coefficient (Kow), vapor pressure
(Pv), organic carbon normalized soil sorption coefficient (Koc), bioconcentration factor (BCF), or
Henry's law constant (H), were compiled from a variety of literature sources and computerized
databases. Using this information, a chemical property database was constructed using
HyperCard™ and subsequently used for developing MCI-property, TSA-property and
property-property relationships. In addition to the properties listed above, the database includes the
following information: compound name and synonyms, a diagram of the 2-D chemical structure,
SMILES notation, uses, CAS number, chemical formula, molecular weight (MW), boiling point
(BP), melting point (MP), and appropriate references for each value. A built-in unit conversion
utility enables users to quickly view property values in a variety of commonly used units. The
database is directly connected to the PEP Processor stack.

The Chemical Property Database also allows the user to search for chemical
compounds by full or partial name or synonym, to sort the compounds by name, boiling point,
melting point, or molecular weight, and to transfer to any of the property estimation
modules. In addition, the user can easily edit existing values, add new values or export information
to a text file or another database.

In addition to the physical properties, information describing the environmental persistence and
toxicity of specific chemicals can also be entered into the database. Placeholders for
biodegradation rates, hydrolysis rates, photolysis rates and LC50s, along with the appropriate
references and comments, have been incorporated into the database. This feature was added to the
database after requests from test users; however, at the time of this report no degradation or toxicity
data had been entered into the database. The chemical property database is illustrated in Figures 13
and 14.

PEP  Batch 

PEP Batch provides users with a method for the convenient, unattended calculation of MCIs,
TSA and UNIFAC activity coefficients and subsequent estimation of physical properties for large
numbers of compounds via the PEP Processor described earlier. Like the PEP Processor, PEP
Batch is divided into MCI, TSA and UNIFAC modules. Each module, as illustrated in Figure 15
for the MCI module, requires the user to select the appropriate input file (i.e. SMILES string,
connection table or 3-D atomic coordinates), choose the information to be sent to the output file (i.e.
chemical name, SMILES string, properties) and start the batch driver.

Figure 13. Example card from the Chemical Property Data Base

Figure 14. PEP database degradation properties.

Figure 15. Example card for PEP Batch, MCI module.


The  MCI  and  UNIFAC  modules  require  the  two-dimensional  molecular  structure  of  the 
chemical  to  be  entered  using  either  SMILES  strings  or  connection  tables. 

To enter SMILES strings into the MCI or UNIFAC batch modules, you must create a text file
containing the name of each chemical and its corresponding SMILES string in columns separated by
tabs, with a hard return at the end of each row. The text file can contain additional tab-delimited
information, but the SMILES strings and chemical names must be in the first or second column.
You indicate the column order when you select the type of input. The text file can be created with
a word processing or spreadsheet program, or you can edit or create a file containing the
SMILES strings and chemical names within PEP by selecting SMILES from the “input structure
type” popup menu.
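
The file layout described above can be illustrated with a short Python sketch that parses such a tab-delimited table. The function name and the sample rows are hypothetical; the sketch only mirrors the layout rules stated in the text (names and SMILES strings in the first two columns, in either order, with optional extra columns).

```python
def read_smiles_table(text, name_col=0, smiles_col=1):
    """Parse tab-delimited rows into (name, SMILES) pairs.

    name_col and smiles_col give the column order (0 or 1), matching the
    choice the user makes when selecting the type of input.
    Rows without both columns (for example blank lines) are skipped.
    """
    records = []
    for line in text.splitlines():  # accepts \r, \n or \r\n row endings
        fields = line.split("\t")
        if len(fields) <= max(name_col, smiles_col):
            continue
        records.append((fields[name_col].strip(), fields[smiles_col].strip()))
    return records


# Sample input: name, SMILES, then an optional extra column.
sample = "benzene\tc1ccccc1\tsolvent\nethanol\tCCO\n"
records = read_smiles_table(sample)

# The same data with the SMILES string in the first column.
swapped = read_smiles_table("c1ccccc1\tbenzene\n", name_col=1, smiles_col=0)
```

Using `splitlines` rather than splitting on a single newline character keeps the sketch tolerant of the different row endings produced by word processors and spreadsheets on different systems.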

To input connection table files into the MCI or UNIFAC batch modules, you must first place
them in a single folder. Select the “Connection tables” option from the “Select Input Type” popup
button and choose the folder that contains the connection table files using the standard “open file”
dialog box that appears. Highlighting any one of the files in the folder selects all of the files in that
folder and allows you to view or delete specific files. The files can be any valid type for the MCI
module, as they will be converted if possible (see the MCI module help). After selecting the input
files, click on the advance arrow located at the upper right of the card to advance to the next step.

The TSA batch module operates in the same manner as the MCI and UNIFAC modules except
that the calculation of TSA requires the three-dimensional Cartesian coordinates for each atom in the
chemical of interest. The TSA batch module accepts Cartesian coordinate files or Alchemy files.
Alchemy files contain both the two-dimensional chemical structure and the coordinates. This
allows PEP to calculate the chemical’s TSA and determine the most appropriate TSA-property
relationship based on chemical class. From within the standard dialog box, you can click on any
file in a folder to select all of the files in that folder. The files will then be displayed. You can
also view or delete files.

Once the input files have been selected, the “output option” step becomes active. This allows
the user to select the properties to be calculated and any additional available information (i.e.
chemical name, SMILES string, MCIs, TSA, or UNIFAC activity coefficients) to be exported
to a tab-delimited text file.

After the output information is selected, the “Start Batch Driver” button becomes active.
Clicking this button brings up the standard Macintosh “save file” dialog box, which allows the user to
specify the name of the data file to be exported by PEP Batch and the location where the file will be
saved.

SUMMARY


A microcomputer program for estimating physical/chemical properties of organic chemicals for
use in environmental fate modeling has been described. The Property Estimation Program
(PEP) and associated physical property database were developed using HyperCard for the Apple
Macintosh series of computers. The PEP system utilizes both QSPRs and QPPRs to provide the
user with several approaches to estimate S, Kow, Pv, H, Koc and BCF depending on the
information available. While QPPRs have been used by both experts and non-experts for
estimating properties, one of the major limitations in using QSPRs has been the difficulty in using
the necessary software tools. The graphical interface and flow chart design of PEP lead the user
through a series of logical steps designed to provide even non-experts with an economical,
easy-to-use software system for property estimation. The structural information for the MCI and UNIFAC
modules can be input using Simplified Molecular Input Line Entry System (SMILES) notation or
connection tables generated from a commercially available two-dimensional drawing program. The
TSA module accepts 3-D Cartesian coordinates entered manually or directly reads coordinate files
generated by molecular modeling software. For each property the user can select from either
"universal" or class-specific regression models. The program's built-in intelligence helps the user
choose the most appropriate QSPR based on the structure of the chemical of interest. In addition,
sufficient statistical information is provided to allow the user to judge the validity of the
QSPRs and QPPRs utilized in PEP. To make the program both practical and educational,
on-line documentation is provided not only for the operational characteristics of the program but
also for the theory associated with the property estimation techniques.

The combination of the various property estimation methods, chemical property database, and
simple environmental fate models provides users with a methodology for predicting the
environmental distribution of an organic chemical in a multi-phase system requiring only the
structure of the chemical of interest as input.

INTRODUCTION


Background 

Mathematical models are often used by environmental scientists and engineers to estimate the
fate and impact of organic chemicals in the environment. Use of these models requires a variety of
parameters describing site and chemical characteristics. Aqueous solubility (S), the octanol/water
partition coefficient (Kow), the organic carbon normalized soil/water sorption coefficient (Koc),
vapor pressure (Pv), Henry's Law constant (H), and bioconcentration factor (BCF) are considered
key properties used to assess the mobility and distribution of an organic chemical in environmental
systems.

One  major  limitation  to  the  use  of  environmental  fate  models  has  been  the  lack  of  suitable 
values  for  many  of  these  properties.  The  scarcity  of  data,  due  mainly  to  the  difficulty  and  cost 
involved  in  experimental  determination  of  such  properties  for  an  increasing  number  of  synthetic 
chemicals,  has  resulted  in  an  increased  reliance  on  the  use  of  estimated  values. 

Quantitative  Property-Property  Relationships  (QPPRs),  based  on  the  relationship  between  two 
properties  as  determined  by  regression  analysis,  are  used  to  predict  the  property  of  interest  from 
another more easily obtained property. Quantitative Structure-Property Relationships (QSPRs)
often take the form of a correlation between one or more structurally derived parameters, such as
molecular connectivity indices (MCIs) or total molecular surface area (TSA), and the property of
interest.

Selection and application of the most appropriate QPPRs or QSPRs for a given compound is
based on several factors, including the availability of required input, the methodology for
calculating the necessary structural or topological information, the appropriateness of the
correlation to the chemical of interest, and an understanding of the mechanisms controlling the
property being estimated.

Incorporation of QPPRs and QSPRs into a computer format is a logical and necessary step to
gain full advantage of the methodologies for simplifying fate assessment.

PEP Overview

A Property Estimation Program (PEP), utilizing MCI-property, TSA-property and
property-property correlations and UNIFAC-derived activity coefficients, has been developed for the Apple
Macintosh microcomputer to provide the user with several approaches to estimate S, Kow, Pv, H,
Koc and BCF depending on the information available.

Structural information required for the MCI and UNIFAC calculation routines can be entered
using either Simplified Molecular Input Line Entry System (SMILES) notation or
connection tables generated with commercially available two-dimensional drawing programs. The
TSA module accepts 3-D atomic coordinates entered manually or directly reads coordinate files
generated by molecular modeling software. The program's built-in intelligence helps the user
choose the most appropriate QSPR or QPPR based on the structure of the chemical of interest. In
addition, the statistical information associated with each QSPR or QPPR in PEP can be displayed
to help the user determine the model's validity. For the regression-based property estimation
models, assessments of accuracy based on the 95% confidence interval and estimated precision of
the experimental values are also provided along with the estimated property value.
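
To make the idea of a regression-based 95% interval concrete, the Python sketch below computes the textbook prediction interval for a simple one-variable linear regression. It is only an illustration: the report does not state how PEP combines the regression interval with the experimental precision, so that step is not shown, and the data, function names and the critical t value supplied by the caller are all assumptions.

```python
import math


def fit_line(x, y):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    sse = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    s = math.sqrt(sse / (n - 2))  # residual standard error
    return {"slope": slope, "intercept": intercept, "s": s,
            "x_bar": x_bar, "sxx": sxx, "n": n}


def prediction_interval(x0, fit, t_crit):
    """Return (estimate, lower, upper) for a new observation at x0.

    t_crit is the two-sided critical t value for n - 2 degrees of freedom,
    looked up by the caller (3.182 for 3 degrees of freedom at 95%).
    """
    y0 = fit["intercept"] + fit["slope"] * x0
    spread = math.sqrt(1.0 + 1.0 / fit["n"] + (x0 - fit["x_bar"]) ** 2 / fit["sxx"])
    half_width = t_crit * fit["s"] * spread
    return y0, y0 - half_width, y0 + half_width


# Hypothetical descriptor (x) and log property (y) values.
x_data = [1.0, 2.0, 3.0, 4.0, 5.0]
y_data = [2.1, 3.9, 6.2, 7.8, 10.1]
fit = fit_line(x_data, y_data)
estimate, lower, upper = prediction_interval(3.5, fit, t_crit=3.182)
```

The interval widens as the new compound's descriptor value moves away from the mean of the training data, which is one reason an estimate for a compound unlike those used to build the regression deserves less confidence.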

PEP also offers a batch mode that provides users with a method for the convenient,
unattended calculation of MCIs, TSA and UNIFAC activity coefficients and the subsequent
estimation of physical properties for large numbers of compounds.

A chemical property database, containing experimental values of S, Kow, H, Pv, Koc, and
BCF compiled from a variety of literature sources and computerized databases, was used for
developing the MCI-property, TSA-property and property-property relationships used in PEP.
This database, which currently contains over 800 chemicals, is linked directly to PEP.

The  property  estimation  modules  in  PEP  are  also  linked  directly  to  the  Level  1  and  2  Fugacity 
Models.  The  combination  of  the  various  property  estimation  methods,  chemical  property  database, 
and  simple  environmental  fate  models  provides  users  with  a  methodology  for  predicting  the 
environmental  distribution  of  an  organic  chemical  in  a  multi-phase  system  requiring  only  the 
structure of the chemical of interest as input.

PEP was designed to be intuitive and user friendly. The easiest way to become familiar with
PEP is to try clicking on the buttons and pull-down menus found on each card. Any comments
or suggestions regarding improving the operation of PEP would be greatly appreciated by the
authors.


PEP  Features 


•  Developed  using  HyperCard™  for  the  Apple  Macintosh  series  of  personal  computers 

•  Comprised  of  a  chemical  property  database  and  four  property  estimation  modules 

•  Uses  standard  Macintosh  operations  (buttons,  menus,  windows) 

•  Simple  user  interface  based  on  flow  chart  design 

• Four property estimation methods are available:

    • Molecular Connectivity Indices (MCIs)-property correlations

    • Total Surface Area (TSA)-property correlations

    • Property-Property Correlations

    • UNIFAC derived activity coefficients

• PEP can be used to estimate six chemical/physical properties:

    • Solubility (S)

    • Octanol-water partition coefficients (Kow)

    • Henry’s Law Constant (H)

    • Vapor Pressure (Pv)

    • Organic carbon normalized soil-water distribution coefficients (Koc)

    • Bioconcentration factors (BCF)

•  Universal  and  class  specific  regression  models  are  available 

•  PEP  uses  decision  support  for  determination  of  chemical  class. 

•  Estimates  include  95%  prediction  interval  for  each  regression  based  estimated  value 

•  Statistical  information  readily  available  for  each  regression 

•  New  regression  models  can  be  easily  added 

• Database contains over 800 chemicals having at least one property value

• Each chemical has at least one property and a two-dimensional SMILES string

• Chemicals in database can be searched for by chemical name, CAS number, synonym, or selected
from an alphabetized list

• Property estimation modules and property database are linked directly to the Fugacity Level 1 and
2 environmental fate models

•  Published  and  on-line  documentation 

•  Includes  PEP  tutorial 

• Includes PEP Batch for estimating properties or calculating MCIs, TSA, or UNIFAC activity
coefficients for large numbers of chemicals without continuous user input

• What Do I Need To Use PEP (i.e. Hardware requirements)?

1. Macintosh II computer or better, with

2. 4,000 Kbytes (4 Megabytes) of usable hard disk space,

3. 3 Megabytes of RAM installed,

4. running system software 6.0.5 or higher, and

5. HyperCard 2.0 software or higher installed, with its memory size allocated to 1500 Kbytes.

•  Installation  of  PEP 

PEP  is  typically  shipped  on  one  3.5  inch  1.44  Megabyte  floppy  disk.  To  install  PEP: 

1.  Insert  the  PEP  disk  into  the  disk  drive. 

2.  Drag  the  file  PEP.sea  to  the  hard  drive  that  PEP  is  to  be  installed  on.  When  the  PEP.sea 
file  has  been  copied  to  the  hard  drive  you  can  eject  the  disk  by  dragging  the  “PEP”  disk  icon  to 
the  trash. 

3. Double click on the icon of the “PEP.sea” file. This will start the installation process by
first creating a new folder called “PEP system” on the hard drive and then uncompacting five
HyperCard stacks: (1) “Chemical Property Data Base”, (2) “PEP Processor”, (3) “PEP Help”,
(4) “PEP Models”, and (5) “PEP Batch” (not necessarily in that order).

4.  Drag  the  “PEP.sea”  icon  to  the  trash  and  remove  it  from  your  hard  drive  using  the  “empty 
the  trash”  command  which  can  be  found  under  the  Special  pull  down  menu  located  at  the  top  of 
the  screen. 

The installation process is now complete. Test the installation by double clicking on the “PEP
system” folder, then double clicking on the file “Chemical Property Data Base”. If installation was
successful, the opening card of PEP will appear. If you have a problem with the installation, please
contact Mark Holt at (801)750-3916 or Bill Doucette at (801)750-3178. Note: PEP can also be
sent on two 3.5 inch 800k floppy disks if requested.

General Programming Description


The  PEP  software  system  is  a  HyperCard™  based  program  that  runs  on  Apple  Macintosh 
computers.  HyperCard,  which  is  bundled  with  most  Macintosh  computers  sold,  offers  graphics, 
information  storage,  the  means  to  display  information  in  a  variety  of  formats,  the  ability  to 
establish links between related information, a high level language (HyperTalk), the ability to extend
HyperTalk by writing new commands in a compiled language (e.g. C or Fortran) and a mechanism
to transfer control to other Macintosh applications. The PEP system uses all these features.

HyperCard  treats  each  screen  full  of  information  as  a  card  and  each  set  of  related  cards  as  a 
stack.  Cards  can  contain  fields  for  data  and  buttons  for  action  procedures  to  operate  on  the  data  in 
the  fields.  This  allows  the  standard  Macintosh  interface  to  be  used  without  the  direct  use  of  the 
Macintosh toolbox routines, greatly simplifying programming. To create a user interface,
the programmer simply draws or creates the buttons or fields that are to be used. The links
between buttons, fields and cards are made through HyperTalk. HyperTalk is a high-level,
interpreted language used to establish links between related information and perform simple
calculations within HyperCard. However, large repetitive tasks and complicated computations can
be very slow if HyperTalk is used. HyperCard also allows the programmer to create extensions of
HyperTalk in a lower level language. These extensions, called external functions (XFCN) and
external commands (XCMD), greatly increase the speed of repetitive and calculation intensive
algorithms over using HyperTalk itself and can also be used to implement custom Macintosh
features such as popup menus and custom dialog boxes.

Starting  the  PEP  Software  System 

If  the  installation  of  PEP  (as  described  on  the  previous  page)  was  successful,  a  new  folder 
called  “PEP  system”,  containing  five  HyperCard  stacks  (“Chemical  Property  Data  Base”,  “PEP 
Processor”,  “PEP  Help”,  “PEP  Models”,  and  “PEP  Batch”),  should  appear  on  your  hard  drive. 

PEP is started from the Macintosh operating system by clicking twice (double clicking) on any
one of the five stack icons shown in Figure 1, except the PEP Batch stack. This will open the
PEP Processor stack and display the opening card shown in Figure 2. The five stacks must
be in the same folder on a hard disk. PEP can also be started by opening any HyperCard stack and
then choosing “Open” from the “File” menu and selecting the “PEP Processor” stack.

Figure 1. PEP stack icons.

Figure 2. PEP opening screen.

Menus, Buttons and Icons


In Macintosh applications most cursor movements are accomplished by use of the mouse or
trackball. Action buttons are operated by positioning the “hand” cursor over the button area and
then depressing and releasing the mouse button. This is referred to as clicking. Popup and
pull-down menus are operated by positioning the hand cursor over the button and then depressing
and holding the mouse button. While holding down the mouse button, move the cursor over the
desired menu item and release. (Note: on slower machines such as the Mac Plus, the menu
selections may be slow to appear, but they will eventually show up.) The action of moving the
mouse while holding down the mouse button is called dragging. Figure 3 shows
the steps involved in using a menu for the selection of a command.

1.  Menu bar before the mouse is pressed

2.  Menu bar after the mouse is pressed

3.  Menu bar after the mouse is moved over the desired command
but before the mouse button is released.


Figure 3. The steps for using a menu for command selection

The  PEP  software  system  is  comprised  of  five  HyperCard  stacks  that  are  linked  together  by 
various  menus  and  buttons.  The  pull  down  menus,  located  at  the  top  of  each  card  and  buttons 
positioned  at  various  card  locations  allow  the  user  to  navigate  through  PEP. 

The File and Edit menus at the top of each PEP card duplicate the general File and Edit menus
in HyperCard (please refer to the HyperCard manual for complete instructions). The Go menu
allows the user to move to any of the four property estimation modules, the chemical property
database, the fugacity model or the batch mode.

In addition to the menu items found at the top of each PEP card, buttons, such as those
displayed below, can be found on each of the various PEP cards. Figure 4 shows each button
icon, its title and its action.


Icon Title        Action

PEP Icon          Shows opening screen of PEP
Return Arrow      Takes you back to the card you were at prior to this one
Help (?)          Shows help for the current card
Information       Shows general information
Eye               Shows the equations or statistics
Book              Shows the reference
First Card        Takes you to the opening card of HyperCard
Pop Up Button     Lets the user choose from a popup menu list
Action Button     Initiates some action or calculation

Figure 4. Buttons and icons used in PEP.


Tutorial 

The  “Click  Here  for  Tutorial”  button,  accessible  on  the  opening  screen  of  PEP,  takes  the  user 
to  the  opening  screen  of  the  tutorial.  As  shown  in  Figure  5,  this  screen  takes  the  form  of  a  flow 
chart  depicting  the  overall  design  of  PEP.  Individual  tutorials,  available  for  the  chemical  property 
database  and  each  of  the  four  property  estimation  modules,  can  be  activated  by  clicking  on  the 
appropriate button in the flow chart. Each tutorial automatically runs the chosen component of the
PEP system while illustrating its operation in a step-by-step manner.

Figure 5. Opening screen of PEP tutorial (flow chart illustrating overall design of PEP system).

REFERENCE SECTION


PEP  Processor 

The PEP Processor stack contains the algorithms for data input, calculations and output of the
physical-chemical properties estimated using MCI-Property, TSA-Property, and Property-Property
correlations and UNIFAC derived activity coefficients. The PEP Processor is divided into four
modules, one module for each of these estimation methods.

The user interface for each module is designed as a flow chart consisting of a series of
numbered steps or actions required to estimate the physical-chemical property(s) of interest. The
steps are numbered sequentially from left to right and top down. Initially, all but the first step are
dimmed. As the user completes each step, the next step darkens, indicating the appropriate
progression. However, at any point a darkened, previously completed step can be redone
or its selection changed. Figure 6 shows this progression of steps.


[Figure panels, in order: before step 1 is completed; before step 2 is completed; before step 3
is completed; after step 3 is completed.]

Figure 6. Example illustrating the use of the PEP Processor’s flow chart interface.

MCI Module


Overview 

The user interface of the MCI module, like the other property estimation modules in PEP, is
designed in the form of a flow chart depicting the steps that must be completed in order to use the
module. As illustrated in Figure 7 by the fully darkened MCI module card, the four steps that must
be completed before the selected physical-chemical properties can be estimated are: (1) input
the necessary structural information for the chemical of interest using either SMILES notation or a
connection file, (2) calculate and display the MCIs, (3) select the properties to estimate and (4)
choose the most appropriate MCI-property regression model.


Figure 7. Screen display of PEP MCI module.

Entering Chemical Structure


Upon  entering  the  MCI  module,  only  the  first  step  in  the  flow  chart  is  active  as  indicated  by  its 
darkened  status.  The  two  dimensional  molecular  structure  of  the  chemical  of  interest  is  needed  to 
calculate  the  MCIs.  You  must  first  input  the  necessary  structural  information  using  either  SMILES 
[33,34]  notation  or  connection  files  before  you  can  continue  to  the  second  step.  Select  either 
option  from  the  “Input  Structure”  popup  button. 

If you select SMILES, two blank lines appear, one for the chemical name (optional) and one
for the SMILES string. Once you have entered the SMILES string, click the “OK” button or press
the Return key. The SMILES string must conform to the standard set by Anderson, Veith, and
Weininger (1987) and Weininger (1988) with the following exceptions: a single bond connecting
aromatic rings must be explicitly denoted by a “-”, the SMILES string must be hydrogen
suppressed, and the SMILES string cannot contain any “{}” or “[]” qualifiers. SMILES is a
chemical notation language specifically designed for computer use. It is a method of "unfolding" a
2D chemical structure into a single line of characters containing the structural information. For
users unfamiliar with SMILES notation, a detailed description of its use can be found in the
scrollable window directly below the SMILES string input line.
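
The input restrictions listed above can be pre-checked before a string is passed to the MCI calculation. The following is a minimal Python sketch under stated assumptions, not PEP's own HyperTalk or XFCN code: the function name `check_pep_smiles` is hypothetical, and the sketch does not try to verify that single bonds between aromatic rings are written explicitly.

```python
def check_pep_smiles(smiles):
    """Return a list of detected problems; an empty list means none were found.

    Hypothetical helper illustrating the restrictions described in the text.
    """
    problems = []
    if not smiles.strip():
        problems.append("empty SMILES string")
    # The text rules out "{}" and "[]" qualifiers.
    for ch in "{}[]":
        if ch in smiles:
            problems.append("qualifier character %r is not accepted" % ch)
    # Hydrogen-suppressed input: no explicit H atoms should appear.
    if "H" in smiles:
        problems.append("explicit hydrogen found; input must be hydrogen suppressed")
    return problems


# Biphenyl written with an explicit single bond between the aromatic rings.
biphenyl_problems = check_pep_smiles("c1ccccc1-c1ccccc1")
```

An empty result only means that none of these simple checks failed; it does not guarantee that the string is valid SMILES.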

If  a  connection  table  is  chosen  as  the  input  method,  the  standard  Macintosh  file  selection  dialog 
box,  as  shown  in  Figure  8,  will  be  used  to  select  the  file.  This  requires  that  a  connection  table  has 
already  been  created  for  the  chemical  of  interest. 


Figure  8.  Standard  Macintosh  file  selection  dialog  box. 


Connection tables can be generated from commercially available, two-dimensional (2D)
chemical drawing programs, such as ChemDraw™, Chemintosh™, or ISIS/Draw™, that have the
ability to save the structure as a connection table file. The connection table file must be formatted
the same as a connection table from ChemDraw (1989). An example of a ChemDraw compatible
connection table is shown in Figure 9.


[Figure image: example connection table with callouts for the title line, the number of atoms,
the number of connections, the atom information block (X,Y,Z coordinates, not used, followed by
the atom symbol), and the connection information block (atom numbers, bond type, and a final
column that is not used).]

Figure 9. Example connection table.

The first data line contains the title of the chemical or any other identifier. The second line
consists of two numbers separated by a comma. The first number is the number of atoms in the
connection table and the second is the number of connections described in the connection table.
The remaining lines describe the type of atoms in the molecule and their location (atom
information) and the atoms to which they are connected (connection information). The total
number of lines depends on the number of atoms in the molecule and the number of connections.

Atom information is contained in four columns separated by one or more spaces. Columns one
through three are the X,Y,Z coordinates of the atom (not used for calculation of MCIs) and column
four contains the atom symbol (e.g. C, Cl, Br, N, etc.). Connection information is contained in
four columns of whole numbers. Columns one and two contain the atom numbers of the atoms
that are connected (atoms are numbered consecutively), column three contains the bond type (1, 2,
3, or 4), and column four usually contains a 1 (not used). The bond type in column three can be
either “1” for a single bond, “2” for a double bond, “3” for a triple bond, or “4” for an aromatic
bond. (NOTE: The Macintosh compatible molecular modeling program, Alchemy II™, generates
files containing connection table information along with 3D atomic coordinates. To use Alchemy
files for input into the MCI module simply treat the Alchemy file as a “ChemDraw Connection
Table”.)
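
The layout described above can be read with a few lines of code. The sketch below is written in Python purely for illustration and is based only on the description in this section; it has not been checked against actual ChemDraw output, and the function name `parse_connection_table` and the ethanol example (including its coordinates) are hypothetical.

```python
def parse_connection_table(text):
    """Parse the connection table layout described in the text.

    Returns (title, atoms, bonds): atoms is a list of element symbols in file
    order, bonds is a list of (atom_a, atom_b, bond_type) integer tuples.
    """
    lines = [ln for ln in text.splitlines() if ln.strip()]
    title = lines[0].strip()
    # Second line: number of atoms and number of connections, comma separated.
    n_atoms, n_bonds = (int(v) for v in lines[1].split(","))

    atoms = []
    for ln in lines[2:2 + n_atoms]:
        # Columns 1-3 are X, Y, Z (ignored for MCIs); column 4 is the symbol.
        atoms.append(ln.split()[3])

    bonds = []
    for ln in lines[2 + n_atoms:2 + n_atoms + n_bonds]:
        # Column 4 of a connection line is not used.
        a, b, bond_type = (int(v) for v in ln.split()[:3])
        if bond_type not in (1, 2, 3, 4):
            raise ValueError("unknown bond type %d" % bond_type)
        bonds.append((a, b, bond_type))
    return title, atoms, bonds


# Made-up example: hydrogen-suppressed ethanol with arbitrary coordinates.
EXAMPLE = """Ethanol (hydrogen suppressed)
3, 2
0.00000 0.00000 0.00000 C
1.50000 0.00000 0.00000 C
2.20000 1.20000 0.00000 O
1 2 1 1
2 3 1 1
"""
title, atoms, bonds = parse_connection_table(EXAMPLE)
```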

As described in the next section, MCIs are calculated from the hydrogen suppressed structure
of a chemical. Consequently, for the most efficient calculation of MCIs, the connection tables
created for input into the MCI module should be created from hydrogen suppressed structures. If
the connection tables are not hydrogen suppressed, PEP will automatically remove the hydrogen
atoms. This can result in a significant increase in the time required to calculate the MCIs.
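
Hydrogen suppression of a connection table amounts to dropping the hydrogen atoms, dropping every connection that touches one, and renumbering the remaining atoms. A minimal Python sketch follows; it assumes atoms are numbered from 1 in file order, and `suppress_hydrogens` is a hypothetical name rather than a PEP routine.

```python
def suppress_hydrogens(atoms, bonds):
    """Remove H atoms and renumber the remaining atoms consecutively from 1.

    atoms: list of element symbols (atom 1 is atoms[0]).
    bonds: list of (atom_a, atom_b, bond_type) tuples with 1-based atom numbers.
    """
    kept = [i for i, symbol in enumerate(atoms, start=1) if symbol != "H"]
    renumber = {old: new for new, old in enumerate(kept, start=1)}
    new_atoms = [atoms[old - 1] for old in kept]
    # Keep only connections whose two atoms both survive.
    new_bonds = [(renumber[a], renumber[b], bond_type)
                 for a, b, bond_type in bonds
                 if a in renumber and b in renumber]
    return new_atoms, new_bonds


# Methanol with explicit hydrogens: C(1), H(2-4), O(5), H(6).
methanol_atoms = ["C", "H", "H", "H", "O", "H"]
methanol_bonds = [(1, 2, 1), (1, 3, 1), (1, 4, 1), (1, 5, 1), (5, 6, 1)]
heavy_atoms, heavy_bonds = suppress_hydrogens(methanol_atoms, methanol_bonds)
```

Note that this simple version discards the hydrogen counts, so any later step that needs them (such as a valence delta) would have to record them first or infer them from normal valences.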

Calculating  MCIs 

As  soon  as  the  structure  is  entered,  the  second  step  becomes  active.  Clicking  on  the  “Calc. 
MCIs”  button  starts  the  calculation  of  the  MCIs.  If  the  structure  was  entered  as  a  SMILES  string 
the  “mciSmile”  XFCN  converts  the  SMILES  string  into  the  proper  format  for  the  “meichi”  XFCN 
which  calculates  the  MCIs.  Similarly,  if  a  connection  table  was  used  for  input  the  “mciConvert” 
XFCN  converts  the  connection  table  into  the  proper  format.  The  MCIs  that  are  calculated  are  listed 
along with the variable names in Table 1. Once the MCIs have been calculated, they can be
viewed, exported, or printed by using the “Display MCIs” button under the “Calc. MCIs” button.

Table 1. MCIs calculated by PEP.

Variable Name    MCI Title               Orders calculated

np               normal path             0 through 6
ncl              normal cluster          3 through 6
nch              normal chain            3 through 6
npc              normal path/cluster     4 through 6
bp               bond path               0 through 6
bcl              bond cluster            3 through 6
bch              bond chain              3 through 6
bpc              bond path/cluster       4 through 6
vp               valence path            0 through 6
vcl              valence cluster         3 through 6
vch              valence chain           3 through 6
vpc              valence path/cluster    4 through 6
Δvp              delta valence path      0 through 6

How  PEP  calculates  MCIs 

To calculate the MCIs for a given compound, a delta ($\delta$) value is first assigned to each
non-hydrogen atom in the structure. Three $\delta$ values were computed in this study: normal, bond, and
valence. Normal deltas are computed by summing the number of bonds (single, double, etc. are
each counted as one bond) connected to the atom whose delta is being calculated. The bond deltas are
calculated the same as the normal deltas except that the bonds are taken at their face value (single is
one, double is two, etc.) instead of each bond being equal to one. Valence deltas for each atom are
computed according to equations (1) and (2) (Kier and Hall, 1986):

$\delta^{v} = Z^{v} - h$    (1)

$\delta^{v} = (Z^{v} - h)/(Z - Z^{v})$    (2)

where $\delta^{v}$ is the valence delta, $Z^{v}$ is the number of valence electrons in the atom, $h$ is the number of
hydrogen atoms bound to the atom, and $Z$ is the atomic number of the atom. Equation (1) is used
for those atoms in the first row of the periodic chart, and equation (2) is used for all other atoms.
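
The three delta definitions can be sketched in a few lines of Python. This is an illustration under stated assumptions, not PEP's implementation: the function names and lookup tables are hypothetical, "first row" is taken to mean C, N, O and F, and the handling of aromatic bonds (type 4) in the bond delta is not specified in the text and is left to the caller. Equation (2) is coded exactly as printed above; Kier and Hall's expression is usually quoted with $Z - Z^{v} - 1$ in the denominator, so the original reference should be checked before relying on values for heavier atoms.

```python
# Valence electrons and atomic numbers for a few common elements.
VALENCE_ELECTRONS = {"C": 4, "N": 5, "O": 6, "F": 7, "P": 5, "S": 6,
                     "Cl": 7, "Br": 7, "I": 7}
ATOMIC_NUMBER = {"C": 6, "N": 7, "O": 8, "F": 9, "P": 15, "S": 16,
                 "Cl": 17, "Br": 35, "I": 53}
FIRST_ROW = {"C", "N", "O", "F"}  # assumption about what "first row" covers


def normal_delta(bond_orders):
    """Each bond to a non-hydrogen neighbor counts as one, whatever its order."""
    return len(bond_orders)


def bond_delta(bond_orders):
    """Bonds are taken at face value: single = 1, double = 2, triple = 3."""
    return sum(bond_orders)


def valence_delta(symbol, n_hydrogens):
    zv = VALENCE_ELECTRONS[symbol]
    if symbol in FIRST_ROW:
        return zv - n_hydrogens                # equation (1)
    z = ATOMIC_NUMBER[symbol]
    return (zv - n_hydrogens) / (z - zv)       # equation (2), as printed


# Carbonyl carbon of acetone: two single bonds and one double bond, no H.
carbonyl_bonds = [1, 1, 2]
carbonyl_deltas = (normal_delta(carbonyl_bonds),
                   bond_delta(carbonyl_bonds),
                   valence_delta("C", 0))
```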

Once the delta values have been calculated for each atom in the molecule, simple (normal), bond and
valence indices of different orders and types can be calculated. The order refers to the number of
bonds in the skeletal substructure or fragment used in computing the index: zero order uses
individual atoms, first order uses individual bonds, second order uses two adjacent bond
combinations, and so on. The type refers to the structural fragment (path, cluster, path/cluster or
chain) used in computing the index. A more detailed explanation of the calculation of MCIs can be
found in Kier and Hall (1986).
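
As an illustration of how order and type enter the calculation, the Python sketch below computes a path-type index of a given order. It uses the general Kier and Hall definition, in which each fragment contributes the reciprocal square root of the product of its atoms' delta values; that formula is not stated in this manual, and the code is not PEP's XFCN. The function name `path_index` and the n-butane example are illustrative only.

```python
from math import sqrt


def path_index(deltas, bonds, order):
    """Sum of 1/sqrt(product of deltas) over all paths containing `order` bonds.

    deltas: dict mapping atom number -> delta value.
    bonds:  iterable of (atom_a, atom_b) pairs for the hydrogen-suppressed graph.
    """
    neighbors = {atom: set() for atom in deltas}
    for a, b in bonds:
        neighbors[a].add(b)
        neighbors[b].add(a)

    paths = set()

    def extend(path):
        if len(path) == order + 1:
            # A path and its reverse are the same fragment; keep one copy.
            paths.add(min(tuple(path), tuple(reversed(path))))
            return
        for nxt in neighbors[path[-1]]:
            if nxt not in path:  # a path never revisits an atom
                extend(path + [nxt])

    for start in deltas:
        extend([start])

    total = 0.0
    for path in paths:
        product = 1.0
        for atom in path:
            product *= deltas[atom]
        total += 1.0 / sqrt(product)
    return total


# n-Butane, hydrogen suppressed: normal deltas are 1, 2, 2, 1.
butane_deltas = {1: 1, 2: 2, 3: 2, 4: 1}
butane_bonds = [(1, 2), (2, 3), (3, 4)]
butane_first_order = path_index(butane_deltas, butane_bonds, 1)  # about 1.914
```

Cluster, path/cluster and chain indices use the same summation over different fragment types and would need their own fragment enumeration.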

The  MCI  calculation  routine  in  PEP  calculates  simple,  bond  and  valence  indices  of  several 
types  (path,  cluster,  chain,  and  path/cluster)  and  orders  (0  through  6),  if  possible,  for  each 
molecule,  resulting  in  a  maximum  of  54  index  values  for  each  molecule. 
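
The maximum of 54 follows from the orders listed in Table 1: each delta type yields 7 path, 4 cluster, 4 chain and 3 path/cluster indices, or 18 in all, and there are three delta types. The short Python sketch below enumerates them. The naming pattern (Table 1 variable name followed by the order, e.g. `np3`) is inferred from the regression equations quoted in this manual and should be treated as an assumption.

```python
# Orders available for each index type, as listed in Table 1.
ORDERS = {
    "p": range(0, 7),    # path: orders 0 through 6
    "cl": range(3, 7),   # cluster: orders 3 through 6
    "ch": range(3, 7),   # chain: orders 3 through 6
    "pc": range(4, 7),   # path/cluster: orders 4 through 6
}
DELTA_PREFIXES = ("n", "b", "v")  # normal, bond, valence

index_names = [prefix + kind + str(order)
               for prefix in DELTA_PREFIXES
               for kind, orders in ORDERS.items()
               for order in orders]

per_delta_type = sum(len(orders) for orders in ORDERS.values())  # 7 + 4 + 4 + 3
total_indices = len(index_names)                                 # 3 * 18
```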

To account for non-dispersive force effects on aqueous solubility and solubility related
properties, zero- through sixth-order delta valence path indices ($\Delta\chi$), as described by Bahnick and
Doucette (1988), are calculated by PEP, in addition to the 54 indices described above. To calculate
$\Delta\chi$ indices, a nonpolar equivalent is made by substituting C for O or N atoms. MCIs are calculated
for the nonpolar equivalent and values for $\Delta\chi$ can be computed for each type of index by:

$\Delta\chi = (\chi)_{np} - \chi$    (3)

where $\Delta\chi$ is the delta index, $(\chi)_{np}$ is the index for the nonpolar molecule and $\chi$ is the index for the
original molecule.
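
A small worked example of equation (3), sketched in Python for the first-order valence path index of ethanol. The assumptions are labeled in the comments: the nonpolar equivalent of ethanol is taken to be propane (the OH oxygen replaced by a CH3 carbon, with hydrogens filled to normal valence), valence deltas come from equation (1), and the first-order index is the sum over bonds of the reciprocal square root of the product of the two atoms' deltas (the general Kier and Hall definition). The function name is hypothetical.

```python
from math import sqrt


def first_order_valence_path(deltas, bonds):
    """Sum over bonds of 1/sqrt(delta_a * delta_b)."""
    return sum(1.0 / sqrt(deltas[a] * deltas[b]) for a, b in bonds)


bonds = [(1, 2), (2, 3)]  # three-atom chain, hydrogen suppressed

# Ethanol CH3-CH2-OH: valence deltas 4-3, 4-2 and 6-1 from equation (1).
ethanol_deltas = {1: 1, 2: 2, 3: 5}
# Assumed nonpolar equivalent, propane CH3-CH2-CH3: deltas 1, 2, 1.
propane_deltas = {1: 1, 2: 2, 3: 1}

chi = first_order_valence_path(ethanol_deltas, bonds)     # about 1.0233
chi_np = first_order_valence_path(propane_deltas, bonds)  # about 1.4142
delta_chi = chi_np - chi                                  # equation (3), about 0.3909
```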

Choosing  the  Properties 

After  the  MCIs  have  been  calculated,  the  third  step  becomes  active.  Select  the  property  or 
properties  you  would  like  to  estimate  by  clicking  on  the  check  box  button  next  to  it.  You  can 
simultaneously  select  all  the  properties  by  holding  down  the  shift  key  and  clicking  any  one  of  the 
property  buttons. 

Choosing  the  Regression  Models  and  Chemical  Classes 

After the properties are selected, step four becomes active and the regression models available
for each property are displayed in a popup menu. Two categories of MCI-property relationships
are displayed for each property. The first category of MCI-property relationships, preceded by
the word PEP, were developed in this project using the experimental values reported in the PEP
property database. “Universal” MCI-property relationships were developed using all available
experimental data for a given property regardless of chemical class. “Class-specific” MCI-property
relationships were developed if property values were available for a sufficient number (10 or
greater) of compounds within a particular chemical class (e.g. PCBs, PAHs, ureas). In addition,
several multi-class MCI-property correlations were developed for broader classes of
compounds such as halogenated aliphatics and halogenated aromatics.

The second category of MCI-property relationships displayed for each property were obtained
directly from the literature and are located below the PEP relationships, separated by a gray line, in
the popup menu. By clicking the “book” found at the left of each literature MCI-property
regression model, the coefficients, r² value and the appropriate citation can be displayed.

To illustrate the potential hierarchy of MCI-property relationships available to the user, an
example for predicting the vapor pressure (Pv) of a polychlorinated biphenyl (PCB) is provided
below. There are three appropriate PEP-derived MCI-property relationships available to the user,
one developed using only PCBs, one using halogenated aromatics including PCBs and one using
all compound types:


log Pv =  5.814 (nc5) - 2.428 (np3) + 9.479     (PCBs)

log Pv = -1.559 (bp1) + 6.622                   (Halogenated aromatics)

log Pv = -1.275 (np3) + 5.261                   (Universal)


Generally,  the  use  of  a  “class-specific”  relationship,  if  available,  should  provide  the  best 
estimate  (i.e.  the  estimate  associated  with  the  least  amount  of  uncertainty).  To  automatically  aid 
you  in  choosing  the  most  appropriate  MCI-property  relationships,  PEP  looks  in  the  SMILES  string 
or  connection  file  for  groups  of  atoms  and  bonds  that  distinguish  various  chemical  classes.  The 
number  of  MCI-property  relationships  or  chemical  classes  that  are  chosen  by  the  program  is 
determined  by  the  number  of  different  distinguishing  subgroups  that  are  found.  For  the  example 
shown  above,  the  most  appropriate  regression  model,  PCBs,  would  be  made  the  default  equation. 
The  two  other  appropriate  models,  halogenated  aromatics  and  Universal,  would  be  denoted  with  a 
♦  in  the  popup  menu.  If  the  compound  entered  into  PEP  does  not  fit  one  of  the  class  specific 
models  the  “Universal”  equation  is  selected  as  the  default.  You  may  also  choose  to  ignore  the 
regression  model  chosen  by  PEP  and  select  your  own. 
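
The default-selection behavior described above can be sketched as an ordered list of models, most class-specific first. The Python below is an illustration, not PEP's code: the coefficients are those of the three log Pv relationships quoted in this section, `bp1` is read as the first-order bond path index, `nc5` is kept as printed, and the function names and the example MCI value are hypothetical. Detecting the chemical classes from the structure is not shown.

```python
# Regression models for log Pv, ordered from most to least class-specific.
LOG_PV_MODELS = [
    ("PCBs", lambda m: 5.814 * m["nc5"] - 2.428 * m["np3"] + 9.479),
    ("Halogenated aromatics", lambda m: -1.559 * m["bp1"] + 6.622),
    ("Universal", lambda m: -1.275 * m["np3"] + 5.261),
]


def rank_models(classes_found):
    """Return (default, alternates) for the classes detected in a structure.

    The Universal model always applies; the first applicable entry in the
    ordered list is the default and the rest are flagged as alternates.
    """
    applicable = [name for name, _ in LOG_PV_MODELS
                  if name == "Universal" or name in classes_found]
    return applicable[0], applicable[1:]


def estimate_log_pv(model_name, mcis):
    """Evaluate the named model on a dict of MCI values."""
    for name, equation in LOG_PV_MODELS:
        if name == model_name:
            return equation(mcis)
    raise KeyError(model_name)


# A PCB belongs to both the PCB and the halogenated aromatic classes.
default_model, alternates = rank_models({"PCBs", "Halogenated aromatics"})
# Made-up np3 value, used only to show the arithmetic.
universal_estimate = estimate_log_pv("Universal", {"np3": 2.0})
```

Keeping the models in one ordered list means the same structure supports both the automatic default and a manual override: the user's choice is simply passed to `estimate_log_pv` in place of `default_model`.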

MCI-property regression models are available for the following “classes” of chemicals:
Universal, Universal Nonionizable, Universal Ionizable, Alcohols, Anilines, Carbamates,
Halogenated Aliphatics, Nonhalogenated Aliphatics, Halogenated Aromatics, PCBs, PAHs,
Phenols, Triazines, and Ureas. Examples of representative chemicals for each of the classes can
be found in the View menu. To see the structures of the chemicals, click on the class name. Not all
of the classes listed above are implemented for each property. The models not available for the
properties to be estimated are dimmed in the popup menu.

Statistics Cards


A summary of the regression statistics and list of compounds used to develop and evaluate each
MCI-property relationship can be displayed by clicking the “eye” or “view statistics option” found
at the left of each regression model. Information displayed on the statistics card includes: the
MCI-property regression equation, the list of chemicals used in developing the regression model, the
standard errors of the coefficients in the regression equation, the Analysis of Variance (ANOVA)
table, the r² value, a graph of the predicted vs. experimental values, a graph of the residuals vs.
the predicted values, a graph of the residuals vs. the number of standard deviations, a normal
probability plot of the residuals, the X’X inverse matrix and appropriate reference. An example of
the statistical information provided for each MCI-property relationship is shown in Figure 10.


[Screen image of a PEP Processor statistics card. Recoverable contents:]

Regression Results

Variable     Coef.      Std. Error    t
Constant     0.3917     0.1376        2.85
vp1         -0.9257     0.0316      -29.3
Δvp1         1.8251     0.1047       17.4
MP-25       -0.01

Analysis of Variance Table

Source        RSS         df     MSS       F
Regression    889.176     2      445       446
Residual      360.920     362    0.997
Total         1250.096    364    3.4343

r² = 71.1%    nobs = 365    s = 0.9985

Plots: Predicted vs. Exp. (experimental log S), Residual vs. Predicted, Residual vs. Prob.
(number of standard deviations).

Figure 10. Statistics card associated with the MCI module.


The ANOVA table contains the sum of squares (RSS), the degrees of freedom (df), the mean
square (MSS), and the variance ratio (F) for the regression, residual, and total sources of variation.



The X’X inverse matrix is used in PEP to calculate the prediction interval of an estimate. The
matrix is derived by pre-multiplying the X matrix by its transpose and then inverting the result.
The X matrix has a column of ones for the constant term, a column for each variable in the
regression equation, and a row for each observation used to calculate the regression equation.
Each row contains the value used for each of the variables in the regression equation. For
example, if the regression equation is

$Y_j = b_0 + b_1 X_{j1} + b_2 X_{j2}$

where $j$ runs from 1 to the number of observations $n$, $Y_j$ is the estimated value for each
observation, the $b_i$ are the regression coefficients, and the $X_{ji}$ are the values of the variables
used, then the X matrix is

$X = \begin{bmatrix} 1 & X_{11} & X_{12} \\ 1 & X_{21} & X_{22} \\ \vdots & \vdots & \vdots \\ 1 & X_{n1} & X_{n2} \end{bmatrix}$

The resulting X’X inverse is a square matrix with the number of rows and columns equal to the
number of coefficients (the constant plus one per variable) in the regression equation.
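
The manual does not spell out how the prediction interval is formed from the X’X inverse. A standard least-squares form, assumed here rather than taken from PEP's code, is $\hat{y}_0 \pm t \, s \sqrt{1 + x_0' (X'X)^{-1} x_0}$, where $x_0$ is the row of predictor values for the new chemical (with a leading 1 for the constant), $s$ is the standard error of the regression, and $t$ is the Student t value for the residual degrees of freedom. The Python sketch below builds the X’X inverse for a tiny made-up data set and evaluates the half-width; all function names and numbers are illustrative, and the t value is supplied by the caller from a t table.

```python
from math import sqrt


def transpose(m):
    return [list(row) for row in zip(*m)]


def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


def invert(m):
    """Gauss-Jordan inversion with partial pivoting (small matrices only)."""
    n = len(m)
    aug = [[float(v) for v in row] + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(m)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [v - f * w for v, w in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def prediction_half_width(xtx_inv, x0, s, t_value):
    """Half-width of the prediction interval at x0 (x0 includes the leading 1)."""
    k = len(x0)
    quad = sum(x0[i] * xtx_inv[i][j] * x0[j] for i in range(k) for j in range(k))
    return t_value * s * sqrt(1.0 + quad)


# Made-up design matrix: a constant column plus one variable, four observations.
X = [[1, 1], [1, 2], [1, 3], [1, 4]]
xtx_inv = invert(matmul(transpose(X), X))
half_width = prediction_half_width(xtx_inv, [1, 2.5], s=1.0, t_value=2.0)
```

The interval is narrowest near the center of the data used to fit the regression and widens as $x_0$ moves away from it, which is one reason estimates for chemicals unlike those in the training set carry more uncertainty.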

If the QSPR or QPPR was taken from the literature, only the input variables and the statistical
information provided in the original reference are included.

Estimating  the  Properties  and  Viewing  the  Results 

Click the “Est. Property” button to calculate the selected properties. The results card displays
the estimated properties and their respective 95% prediction intervals. Note: 95% prediction
intervals are not available for MCI-property relationships taken from the literature. You can
return to the previous card by clicking on the “return” button at the upper right corner of the results
card. The “Go” menu can be used to move to another module. You can compare the estimated
property values with those contained in the “Chemical Property Database”, if available, by clicking
on the “Look in DB” button. Clicking this button activates a database “search by name” routine.
The name on the results card must exactly match the name in the database for the search routine to
find the compound. If a property value is not found, “NA” will be displayed.

Adding  or  Deleting  MCI-Property  Regression  Models 

Additional MCI-property regression models can be added to the PEP MCI module through the
statistics cards. The statistics for a new regression equation can be added to PEP by choosing the
“New Stat. Card” option from the “Misc.” menu while a statistics card of the type that is to be
added is the current card. First, the new card must be titled. This title will be
used in the popup menus and on the results card. The first 28 characters of the title must be
unique. The second step is to enter the regression equation. The user will be prompted for the
number of terms in the equation, then prompted for each coefficient and the associated variable
with the dialog box shown in Figure 11. The variables which are available for that type of
statistics card will be in a popup menu for easy, consistent selection.


[Dialog box “Choose MCI and Input Coefficient”: a “Choose MCI” popup menu, a “Coefficient”
entry field (example entry: -2.3), and a “Cancel” button.]

Figure 11. Example dialog box for the input of new MCI-property relationships


A relationship can easily be deleted from PEP by first making that statistics card the current
card, and then choosing “Del. Stat. Card” from the “Misc.” menu. The user will be prompted to
confirm the deletion and then the regression list will be rebuilt.
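
The add and delete behavior described in this section, including the rule that the first 28 characters of a title must be unique, can be modeled with a small registry. This Python sketch is illustrative only; the class name `RegressionList`, its methods, and the example titles and coefficients are hypothetical and do not reflect how the HyperCard stack stores its cards.

```python
class RegressionList:
    """Toy registry of regression models keyed by the first 28 title characters."""

    KEY_LENGTH = 28

    def __init__(self):
        self._models = {}

    def add(self, title, terms):
        """terms: list of (coefficient, variable_name) pairs, e.g. (-2.3, "np0")."""
        key = title[:self.KEY_LENGTH]
        if key in self._models:
            raise ValueError("first 28 characters of the title must be unique")
        self._models[key] = (title, list(terms))

    def delete(self, title):
        del self._models[title[:self.KEY_LENGTH]]

    def titles(self):
        return [title for title, _ in self._models.values()]


registry = RegressionList()
registry.add("Example halogenated aromatics model A", [(-2.3, "np0"), (1.1, "constant")])
try:
    # Same first 28 characters as the title above, so this is rejected.
    registry.add("Example halogenated aromatics model B", [(0.5, "np1")])
    duplicate_rejected = False
except ValueError:
    duplicate_rejected = True
```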

Limitations  of  MCI-Property  Regression  Models 

Selection  of  the  most  appropriate  MCI-property  relationship  depends  on  the  structure  of  a 
particular  chemical  of  interest,  knowledge  of  the  mechanism  of  the  process,  and  the  extent  of  the 
database  used  to  develop  the  MCI-property  relationship.  For  example,  some  MCI-property 
relationships  are  broader  than  others  in  the  range  of  chemicals  that  are  covered,  and  some  have 
been  established  with  a  better  understanding  of  the  mechanisms  or  properties  involved. 

One problem that has limited the widespread acceptance of MCI-property correlations is that the
actual physical meaning associated with the individual indices is not well understood. Frazier
(1990) and Doucette and Holt (1991), however, have shown a strong correlation between
calculated molecular surface area and several MCIs for a variety of organic chemicals.
MCI-property correlations tend to be class specific and thus are highly dependent on the type and range
of compounds that were used to derive a particular correlation. Indiscriminate use of such models
without an examination of the number and type of compounds used to develop the model can result in
considerable error.

TSA Module


Overview 

As  shown  in  Figure  12,  the  TSA  module  is  similar  in  design  to  the  MCI  module.  However, 
unlike  molecular  connectivity,  the  calculation  of  TSA  requires  information  describing  the  geometry 
of  the  molecule  in  terms  of  its  3-D  atomic  coordinates. 


[PEP Processor TSA card. Fields: Chemical Name (example: 2,2',6,6'-tetrachlorobiphenyl) and
SMILES String. Controls: “1. Input Structure”; “2. Calc. TSAs”, with “Edit van der Waals Radii”
and “Display TSAs”; “3. Choose Prop.” with check boxes for S, Kow, Pv, H, Koc, and BCF, each
with a regression popup menu (entries shown include PCBs, Halogenated Aromatics, and
Universal); and “5. Estimate Properties”.]


Figure 12. TSA module card from PEP


Three-dimensional coordinates can be obtained from X-ray crystallography data or from
molecular modeling. Alchemy from Tripos Associates (1989) and Chem3D+ from Cambridge
Scientific (1989) are examples of Macintosh-compatible molecular modeling software that allow
the user to draw a chemical structure, energy minimize the structure, and produce a file
containing the three-dimensional coordinates. The TSA module is also designed to accept files
generated by other hardware/software combinations, including UNIX or VAX versions of CONCORD
(Tripos Associates, Inc., 1990), a hybrid expert system and molecular modeling program designed
for the rapid generation of high-quality approximate 3-D molecular structures.


Entering  the  Structural  Information 

To estimate properties using the TSA module, you must first input the necessary structural
information by using one of the three options available under the popup menu titled “Input
Structure”: (1) Alchemy file, (2) cartesian coordinates file, or (3) manually entered cartesian
coordinates. The preferred method of structural input is via Alchemy files because they contain
both the three-dimensional structure and the connection information. This information allows PEP
to calculate the chemical’s TSA and determine the most appropriate TSA-property relationship
based on chemical class. Therefore, if an Alchemy file is chosen for the input, only the file
selection dialog box is presented. However, if either of the cartesian coordinate options is
selected, both the standard file selection dialog box and an opportunity to select a connection
table file or enter a SMILES string are presented.

The format of an Alchemy file, shown in Figure 13, is similar to that of a ChemDraw
connection table discussed previously. The first line of an Alchemy file contains the number of
atoms followed by “ATOMS,” the number of bonds followed by “BONDS,” the number of
charges followed by “CHARGES,” and then the title of the file. The next set of lines contains six
columns of information for each atom in the molecule, including the hydrogen atoms. Column one
contains the atom number, column two contains the atomic symbol, and columns three through
five contain the X,Y,Z coordinates of the atom; the sixth column is not used. The set of lines
describing the atoms is followed by a series of lines containing four columns describing the
bonds. The first column contains the bond number, the second and third columns contain the atom
numbers of the two atoms connected, and the fourth column contains the type of bond: “SINGLE,”
“DOUBLE,” “TRIPLE,” or “AROMATIC.”

[Annotated listing of an Alchemy file for TCE. The header line reads “6 ATOMS, 5 BONDS,
0 CHARGES, TCE” and is labeled number of atoms, number of bonds, not used, and title. The atom
information block is labeled atom number, atom symbol, X,Y,Z coordinates, and not used. The
connection information block is labeled bond number, atom numbers, and bond type. The
individual atom and bond records are not legible in this copy.]


Figure 13. Example Alchemy file.
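
The Alchemy layout described above can be read with a few lines of code. The following Python
sketch is only an illustration of that description, not the routine PEP uses: the names
parse_alchemy, Atom, and Bond are hypothetical, the two-atom example and its coordinates are
made up, and files written by Alchemy itself may contain details (such as atom type codes) that
this sketch does not handle.

```python
from dataclasses import dataclass


@dataclass
class Atom:
    number: int
    symbol: str
    x: float
    y: float
    z: float


@dataclass
class Bond:
    number: int
    atom_a: int
    atom_b: int
    kind: str  # SINGLE, DOUBLE, TRIPLE, or AROMATIC


def parse_alchemy(text):
    """Parse Alchemy-style text into (title, atoms, bonds)."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    # Header: "<n> ATOMS, <m> BONDS, <k> CHARGES, <title>"
    header = [part.strip() for part in lines[0].split(",", 3)]
    n_atoms = int(header[0].split()[0])
    n_bonds = int(header[1].split()[0])
    title = header[3] if len(header) > 3 else ""

    atoms = []
    for ln in lines[1:1 + n_atoms]:
        f = ln.split()
        # Columns one to five are used; a sixth column, if present, is ignored.
        atoms.append(Atom(int(f[0]), f[1], float(f[2]), float(f[3]), float(f[4])))

    bonds = []
    for ln in lines[1 + n_atoms:1 + n_atoms + n_bonds]:
        f = ln.split()
        bonds.append(Bond(int(f[0]), int(f[1]), int(f[2]), f[3].upper()))
    return title, atoms, bonds


# Hypothetical two-atom example; the coordinates are invented for illustration.
EXAMPLE = """\
2 ATOMS, 1 BONDS, 0 CHARGES, example
1 C 0.0000 0.0000 0.0000 0.0000
2 CL 1.7700 0.0000 0.0000 0.0000
1 1 2 SINGLE
"""

title, atoms, bonds = parse_alchemy(EXAMPLE)
```

Keeping the bond records alongside the coordinates mirrors the reason Alchemy files are the
preferred input: the same file supplies both the geometry and the connection information.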

PEP accepts cartesian coordinates files having the following format. The file has one header
line indicating the number of atoms in the molecule. Each remaining line describes one atom in
the molecule. Only the first five columns of each line are used. Column one contains the atom
symbol, column two contains the atom number, and columns three through five contain the
X,Y,Z coordinates of the atom. An example cartesian coordinates file is shown in Figure 14.

Figure 14. Example Cartesian coordinate file.

If the option to manually enter the coordinates is chosen, a card is presented, as shown in
Figure 15, that allows the coordinates and atom symbol of each atom to be entered individually
from the keyboard. The X,Y,Z coordinate values and atom symbol for each atom in the molecule
are entered in the appropriately labeled boxes one at a time. After the information for each
line is correctly entered, the “Line OK” button is clicked. This enters the information into the
scrollable window below. This process is repeated for each atom in the molecule. When all the
atoms have been entered, clicking the “Done” button sends the structural information to the TSA
module.
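
Whether the coordinates come from a file or from the keyboard, the TSA module ends up with one
record per atom. As a rough sketch, the cartesian coordinates file layout described above could
be read as follows. This assumes the symbol is in column one, the atom number in column two,
and X,Y,Z in columns three through five; the function name parse_cartesian and the example
coordinates are hypothetical.

```python
def parse_cartesian(text):
    """Return a list of (symbol, number, x, y, z) tuples."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    n_atoms = int(lines[0].split()[0])   # header line: number of atoms
    atoms = []
    for ln in lines[1:1 + n_atoms]:
        f = ln.split()[:5]               # only the first five columns are used
        atoms.append((f[0], int(f[1]), float(f[2]), float(f[3]), float(f[4])))
    if len(atoms) != n_atoms:
        raise ValueError("file has fewer atom lines than the header declares")
    return atoms


# Hypothetical two-atom file; the trailing text on the last line is ignored.
EXAMPLE = """\
2
C 1 0.0000 0.0000 0.0000
CL 2 1.7700 0.0000 0.0000 ignored
"""

atoms = parse_cartesian(EXAMPLE)
```

Because this format carries no bond information, a connection table or SMILES string is still
needed before a class-specific regression can be chosen automatically.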


[Coordinate entry card. Instructions shown on the card: “Fill in one atom at a time at the
bottom, type Return or Tab to put it into the table. To edit an entry click on the line in the
table. Click ‘All Done’ when the table is complete.”]

Figure 15. Example card for the entry of atomic coordinates.


Calculating  Total  Surface  Area  (TSA) 

In addition to the 3-D molecular structure, a van der Waals radius is required for each of the
atoms before the TSA of the molecule can be calculated. PEP automatically enters a van der
Waals radius for each atom using values from Pauling (1960). However, when the “Calc. TSA”
button is clicked the user has the opportunity to edit the van der Waals radii using the dialog
box shown in Figure 16.

Symbol            Radius (Å)
C                 1.7
CL                1.8
BR                1.95
O                 1.4
N                 1.5
H                 1.2
S                 1.85
P                 1.9
F                 1.35
Solvent Radius    0.0  (enter 0.0 for Total Surface Area to be calculated)

[The dialog box also contains “Reset Defaults” and “Cancel” buttons and the note: Default values
taken from Pauling. 1960. “The Nature of the Chemical Bond.” Cornell University Press. Ithaca,
New York.]


Figure 16. Dialog box for editing van der Waals radii.


This  dialog  box  also  contains  a  place  to  enter  a  solvent  radius.  If  it  is  left  at  “0.0”  then  the  total 
surface  area  will  be  calculated.  Some  relationships  from  the  literature  require  the  solvent  accessible 
surface  area  to  be  calculated.  If  this  is  the  case,  the  desired  solvent  radius  can  be  entered. 

Once the molecular geometry and the van der Waals radii are input, the “Calc. TSA” button
becomes active and TSA can be calculated using an XFCN that was adapted from the SALV02
algorithm developed by Pearlman (1980). This algorithm represents each atom of a molecule by a
sphere centered at the equilibrium position of the nucleus, with a radius equal to the van der
Waals radius. Planes of intersection between spheres are used to estimate the contribution to
surface area from the individual atoms or groups. The program computes the surface area of
individual atoms or groups by numerical integration, and the overlap due to intersecting
spheres is excluded from the calculation. TSA is calculated by summing the individual group
contributions. These areas are then imported to the “TSAs” card shown in Figure 17.

[TSAs card. Legible entries: one calculated value (72.158); “Not Using a Solvent Radius” (0.0);
total area (Å2) “Not Calc.”; total volume (Å3) “Not Calc.”]


Figure 17. Example TSA card.


To quantify the surface area attributed to the polar portions of the molecule, the surface areas
of nitrogen, oxygen, phosphorus, sulfur, and aromatic nitrogen atoms are separated individually
from the TSA and placed on the “TSAs” card. A more detailed description of the TSA calculation
method is provided by Pearlman (1980).
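
To make the idea of summing per-atom exposed areas concrete, here is a small Python sketch. It
does not reproduce Pearlman's algorithm, which works with planes of intersection. Instead it
swaps in a simpler point-sampling approximation: points are spread over each atomic sphere and
only those not buried inside a neighboring sphere are counted. The function names and the point
count are assumptions; the default radii are the values listed in the PEP dialog box.

```python
import math

# Default radii in angstroms, as listed in the PEP dialog box (Pauling, 1960).
VDW_RADII = {"C": 1.7, "CL": 1.8, "BR": 1.95, "O": 1.4, "N": 1.5,
             "H": 1.2, "S": 1.85, "P": 1.9, "F": 1.35}


def sphere_points(n):
    """Roughly uniform points on a unit sphere (golden-spiral lattice)."""
    golden = math.pi * (3.0 - math.sqrt(5.0))
    pts = []
    for i in range(n):
        z = 1.0 - 2.0 * (i + 0.5) / n
        r = math.sqrt(1.0 - z * z)
        pts.append((r * math.cos(golden * i), r * math.sin(golden * i), z))
    return pts


def surface_areas(atoms, solvent_radius=0.0, n_points=2000):
    """Per-atom exposed areas; atoms are (symbol, x, y, z) tuples."""
    unit = sphere_points(n_points)
    radii = [VDW_RADII[sym.upper()] + solvent_radius for sym, *_ in atoms]
    areas = []
    for i, (_, xi, yi, zi) in enumerate(atoms):
        ri = radii[i]
        exposed = 0
        for ux, uy, uz in unit:
            px, py, pz = xi + ri * ux, yi + ri * uy, zi + ri * uz
            # A sample point is buried if it lies inside any other sphere.
            buried = any(
                (px - xj) ** 2 + (py - yj) ** 2 + (pz - zj) ** 2 < radii[j] ** 2
                for j, (_, xj, yj, zj) in enumerate(atoms) if j != i
            )
            if not buried:
                exposed += 1
        areas.append(4.0 * math.pi * ri * ri * exposed / n_points)
    return areas


def partial_area(atoms, areas, symbol):
    """Sum the exposed area of every atom with the given symbol (e.g. 'O')."""
    return sum(a for (sym, *_), a in zip(atoms, areas) if sym.upper() == symbol)
```

With a solvent radius of 0.0 the sum of the returned areas approximates the total surface area;
a nonzero solvent radius enlarges every sphere, which approximates a solvent accessible surface.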

Choosing  the  Most  Appropriate  TSA-Property  Regression  Model 

After the TSA has been calculated, you can display the values and/or choose the properties of
interest and a corresponding regression model using the same approach described in the MCI
module. As discussed previously, “class-specific” regression models generally yield estimates
associated with the least amount of uncertainty. If an Alchemy file is used to enter the 3-D
structure information, or if a SMILES string or connection file is entered along with the
cartesian coordinates, the decision support system in PEP will choose the most appropriate
TSA-property regression model(s) as described in the MCI module. The most appropriate
regression model will be made the default and a ♦ will be placed next to the other appropriate
class names in the popup menu containing the list of regressions. If no class-specific
regression models are available, the “Universal” equation will be made the default. The
operation of the TSA module from this point on is identical to that of the MCI module. Note:
for solid solutes the melting point is needed to estimate the solubility. When the “S” button is
clicked, the user is prompted with a dialog box to enter the melting point. Because the
solubility is estimated at 25°C, a melting point value is required only if it is above 25°C
(solid solute). If the melting point is below 25°C, it is set equal to 25°C. At this time only
the “< 25°C” and “known” options are usable in the dialog box.

TSA-property regression models available within PEP are: Universal, Universal Nonionizable,
Universal Ionizable, Alcohols, Anilines, Carbamates, Halogenated Aliphatics, Nonhalogenated
Aliphatics, Halogenated Aromatics, PCBs, PAHs, Phenols, Triazines, and Ureas. Not all models
are available for each property. Models that are not available for the properties to be
estimated are dimmed in the popup menu. You can also view the statistical information
associated with each model by clicking on the “eye” next to the regression equation.

Estimating  the  Properties 

After the appropriate TSA-property relationships have been selected, click on the “Estimate
Properties” button to calculate the estimated properties. The estimated properties and their
respective 95% prediction intervals will be displayed on the results card. Return from the
results card by clicking on the “Return arrow” button at the upper right-hand corner of the
results card. Use the “Go” menu to move to another module. View values in the Chemical Property
Database by clicking on the “Look in DB” button. If a property value is not available in the
database, an NA will be displayed in the property field.

Development  of  TSA-Property  Relationships 

The PEP TSA-property relationships were developed using stepwise regression techniques and
the data in the Chemical Physical Data Base. The stepwise regression was stopped when the
coefficient of determination did not improve by at least 0.05 when the next variable was
added.
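
The stopping rule can be illustrated with a generic forward-selection loop. The Python sketch
below is an assumption about how such a rule might be coded, not the procedure actually used to
build the PEP relationships: the function names are hypothetical, the fit is plain ordinary
least squares with an intercept, and any descriptor columns passed in are the caller's own.

```python
def _solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x


def r_squared(columns, y):
    """R^2 of an ordinary least-squares fit of y on the columns plus an intercept."""
    rows = [[1.0] + [col[i] for col in columns] for i in range(len(y))]
    k = len(rows[0])
    xtx = [[sum(r[p] * r[q] for r in rows) for q in range(k)] for p in range(k)]
    xty = [sum(r[p] * yi for r, yi in zip(rows, y)) for p in range(k)]
    beta = _solve(xtx, xty)
    fitted = [sum(b * v for b, v in zip(beta, r)) for r in rows]
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot


def forward_stepwise(candidates, y, min_gain=0.05):
    """Add the best remaining variable until R^2 improves by less than min_gain."""
    selected, best_r2 = [], 0.0
    remaining = dict(candidates)
    while remaining:
        scores = {name: r_squared([candidates[s] for s in selected] + [col], y)
                  for name, col in remaining.items()}
        name = max(scores, key=scores.get)
        if scores[name] - best_r2 < min_gain:
            break
        selected.append(name)
        best_r2 = scores[name]
        del remaining[name]
    return selected, best_r2
```

Called with a dictionary of descriptor columns (for example a TSA column and a partial-area
column) and a list of log property values, forward_stepwise returns the retained descriptor
names and the final coefficient of determination.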

For compounds containing polar functional groups, the addition of partial TSA terms (i.e.,
nitrogen (N-TSA), oxygen (O-TSA), aromatic nitrogen (ArN-TSA), sulfur (S-TSA), or
phosphorus (P-TSA)) significantly improved the TSA-property regression models. To view the
regression information, click on the “eye” next to the regression title on the results card or
on the TSA card.

Limitations of TSA-Property Relationships

A major factor in the solubilization process is the energy required to create a cavity in the
solvent into which the solute is placed. The energy needed for this cavity formation is
considered to be proportional to the surface area of the solute. TSA has been found to be
linearly related to the logarithm of solubility for many classes of non-ionizable organic
chemicals.

As  with  the  MCI  module,  selection  of  the  most  appropriate  TSA-property  relationship  depends 
on  the  structure  of  a  particular  chemical  of  interest,  knowledge  of  the  mechanism  of  the  process, 
and  the  extent  of  the  database  used  to  develop  the  TSA-property  relationship.  Some  TSA-property 
relationships  are  broader  than  others  in  the  range  of  chemicals  that  are  covered  and  some  have  been 
established with a better understanding of the mechanisms or properties involved.

UNIFAC Module


Overview 

The UNIFAC (UNIQUAC Functional Group Activity Coefficient) group contribution method
for calculating activity coefficients, as described by Grain (1990), is implemented using both
HyperTalk and an XFCN. The functional group interaction parameters, presented by Gmehling et
al. (1982) and derived from vapor-liquid equilibria (VLE), are used in the calculation routine
but can be changed by the user. After the activity coefficients are calculated, they can be
displayed along with relevant intermediate values and used to estimate S and Kow by the
following expressions (Arbuckle, 1986):


$K_{ow} = 0.115 \, \gamma_w^{\infty} / \gamma_o^{\infty}$    (2)

$S \ (\mathrm{mol/L}) = 55.6 / \gamma_w^{\infty}$    (3)


where $\gamma_w^{\infty}$ is the activity coefficient of the chemical infinitely dilute in water
and $\gamma_o^{\infty}$ is the activity coefficient of the chemical infinitely dilute in octanol.
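
Equations 2 and 3 are simple enough to check by hand. The short Python sketch below applies
them directly, reading Equation 3 as 55.6 divided by the activity coefficient in water. The
function names are hypothetical, and the activity coefficients in the usage lines are
placeholder inputs, not values calculated by UNIFAC.

```python
KOW_FACTOR = 0.115      # constant in Equation 2
WATER_MOLARITY = 55.6   # mol/L, constant in Equation 3


def kow_from_activity(gamma_water, gamma_octanol):
    """Equation 2: Kow from infinite-dilution activity coefficients."""
    return KOW_FACTOR * gamma_water / gamma_octanol


def solubility_from_activity(gamma_water):
    """Equation 3: aqueous solubility in mol/L."""
    return WATER_MOLARITY / gamma_water


# Placeholder activity coefficients, chosen only to exercise the functions.
kow = kow_from_activity(1.0e5, 2.0)
s = solubility_from_activity(1.0e5)
```

Both expressions show why a large activity coefficient in water means a hydrophobic chemical:
Kow grows in proportion to it while S shrinks in proportion to it.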

The operation of the UNIFAC module, illustrated in Figure 18, is considerably different from
that of the correlation modules previously described. To use the UNIFAC module you must input
structural information, calculate the activity coefficients, and then choose the properties to
be estimated. Currently the only properties that can be directly estimated using the UNIFAC
module are S and Kow.

[UNIFAC card. Fields: Chemical Name (example: 2,2',6,6'-tetrachlorobiphenyl), SMILES String
(example: c1(Cl)cccc(Cl)c1-c2c(Cl)cccc2(Cl)), and UNIFAC Groups (example: 4 ACCl 6 ACH 2 AC).
Controls: “1. Input Structure”; “2. Calculate Activity Coefficient”, with “Edit UNIFAC
Parameters” and “Display Act. Coeff.”; “3. Choose Property” with check boxes for S and Kow; and
“4. Estimate Properties”.]


Figure 18. Example card from the PEP UNIFAC module.


Entering  Structural  Information 

Calculation  of  activity  coefficients  via  the  UNIFAC  approach  requires  that  the  user  input  the 
valid  UNIFAC  groups  that  make  up  the  chemical  of  interest  and  the  number  of  each  group  present. 
PEP  provides  the  user  with  three  options  for  entering  the  appropriate  UNIFAC  groups  using  the 
popup  menu  under  the  button  titled  “Input  Structure”:  (1)  hand  selecting  the  groups  from  a  list  in 
HyperCard,  (2)  using  a  connection  table,  and  (3)  using  SMILES. 

If the connection table option is chosen, the standard file selection dialog box is presented
where the user can select the desired connection table file. The "Struct" XFCN then uses the
connection information to dissect the molecule into its UNIFAC groups.

If  SMILES  is  chosen  as  the  input  method,  the  user  is  then  prompted  to  input  the  chemical  name 
and  the  SMILES  string.  The  Struct  XFCN  uses  the  connection  information  contained  in  the 
SMILES string to build the UNIFAC input string.

When the user chooses to hand select the UNIFAC groups, a card is displayed showing the
first 37 groups, as shown in Figure 19.

Figure 19. UNIFAC module card used to select UNIFAC groups.


This  card  is  connected  to  two  other  similar  cards  showing  the  rest  of  the  available  groups.  To 
select  a  group,  the  user  clicks  on  the  group  symbol  and  a  dialog  box  will  then  appear  to  enable  the 
user  to  input  the  number  of  this  group  that  is  in  the  molecule.  The  user  then  continues  to  select 
groups  until  all  the  groups  that  are  in  the  molecule  have  been  selected.  When  the  user  clicks  the 
“Done”  button  the  UNIFAC  input  string  is  built  and  returned  to  the  UNIFAC  card.  In  addition, 
for  users  familiar  with  the  UNIFAC  approach,  the  appropriate  subgroups  can  also  be  entered 
directly  by  simply  typing  the  number  of  a  group  followed  by  a  space  then  the  symbol  of  the  group 
for  each  group  in  the  chemical. 

The final form of the input is “# group # group ....”, where # is the number of times the
functional group occurs in the molecule and group is the symbol of the functional group. For
example, the UNIFAC input string for toluene is “5 ACH 1 ACCH3”, meaning five aromatic carbons
with one hydrogen atom each (5 ACH) and one aromatic carbon connected to a methyl group
(1 ACCH3). The UNIFAC groups available to PEP are shown in Table 2. Remember that UNIFAC
subgroups may not be available for every compound. In these cases the activity coefficient
cannot be calculated using UNIFAC.
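
The “# group # group” form maps naturally onto a table of counts. As a minimal Python sketch
(the function names are hypothetical and this is not the Struct XFCN), an input string can be
split into pairs and rebuilt like this:

```python
def parse_unifac_string(text):
    """Turn '5 ACH 1 ACCH3' into {'ACH': 5, 'ACCH3': 1}."""
    tokens = text.split()
    if len(tokens) % 2 != 0:
        raise ValueError("expected alternating count and group symbol")
    groups = {}
    for count, symbol in zip(tokens[0::2], tokens[1::2]):
        # Repeated symbols are accumulated rather than overwritten.
        groups[symbol] = groups.get(symbol, 0) + int(count)
    return groups


def format_unifac_string(groups):
    """Build the '# group # group' string from a mapping of symbol to count."""
    return " ".join(f"{count} {symbol}" for symbol, count in groups.items())


toluene = parse_unifac_string("5 ACH 1 ACCH3")
```

A caller could compare the parsed symbols against the groups in Table 2 before attempting a
calculation, since a symbol without UNIFAC values means the activity coefficients cannot be
computed.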


Table  2.  UNIFAC  groups 


Group Name    Group Symbols

CH2           CH3, CH2, CH
C=C           CH2=CH, CH=CH, CH2=C, CH=C
ACH           ACH, AC
ACCH2         ACCH3, ACCH2, ACCH
OH            OH
CH3OH         CH3OH
H2O           H2O
ACOH          ACOH
CH2CO         CH3CO, CH2CO
CHO           CHO
CCOO          CH3COO, CH2COO
HCOO          HCOO
CH2O          CH3O, CH2O, CH-O, FCH2O
CNH2          CH3NH2, CH2NH2, CHNH2
CNH           CH3NH, CH2NH, CHNH
(C)3N         CH3N, CH2N
ACNH2         ACNH2
Pyridine      C5H5N, C5H4N, C5H3N
CCN           CH3CN, CH2CN
COOH          COOH, HCOOH
CCl           CH2Cl, CHCl, CCl
CCl2          CH2Cl2, CHCl2, CCl2
CCl3          CHCl3, CCl3
CCl4          CCl4
ACCl          ACCl
CNO2          CH3NO2, CH2NO2, CHNO2
ACNO2         ACNO2
CS2           CS2
CH3SH         CH3SH, CH2SH
Furfural      Furfural
DOH           (CH2OH)2
I             I
Br            Br
C≡C           CH≡C, C≡C
DMSO          DMSO
ACRY          ACRY
ClCC          Cl-(C=C)
ACF           ACF
DMF           DMF-1, DMF-2
CF2           CF3, CF2, CF

Calculate  Activity  Coefficients 


After  the  functional  groups  are  chosen,  the  activity  coefficients  can  be  calculated  using  the 
procedure  described  by  Grain  (1990)  by  clicking  on  the  “Calc.  Activity  Coefficients”  button.  Once 
the  activity  coefficients  have  been  calculated  they  can  be  displayed,  along  with  values  from  several 
intermediate  steps,  by  clicking  on  the  “Display  Act.  Coeff.”  button.  The  equations  used  to 
calculate  the  activity  coefficients  can  also  be  displayed  by  clicking  the  “balloon”  on  the  “UNIFAC 
Calculations”  card. 

Editing  Parameters 

The  UNIFAC  group  values  can  be  edited  by  clicking  on  the  “Edit  Parameters”  button.  This 
will  take  you  to  an  Index  card  containing  a  button  for  each  group.  To  edit  the  value  for  that  group, 
click  on  the  corresponding  group  button.  To  edit  the  Q  and  R  values,  click  on  the  left  arrow  at  the 
bottom  of  the  index  card. 

Estimating  Properties 

To  estimate  S  and/or  Kow  click  on  the  “Estimate  Properties”  button.  Clicking  the  “eye”  next  to 
the  property  will  display  the  equations  used  to  estimate  S  or  Kow.  The  results  will  be  reported  on 
the  results  card. 

Limitations  of  UNIFAC  Approach  to  Estimating  S  and  Kow 

The use of UNIFAC is limited to chemicals that have subgroups contained in a UNIFAC data
base. If a chemical has subgroups for which no UNIFAC values are available, the activity
coefficients cannot be calculated and, therefore, the properties cannot be estimated. In
addition, it has been observed that the errors associated with estimating S and Kow via the
UNIFAC approach tend to increase as the compound becomes less soluble (larger Kow). Correction
factors have been presented in the literature to correct for this tendency.

Property/Property Module

Overview 

Quantitative  Property-Property  Relationships  (QPPRs),  based  on  the  relationship  between  two 
properties  as  determined  by  regression  analysis,  are  used  to  predict  the  property  of  interest  from 
another  more  easily  obtained  property  without  a  specific  concern  for  molecular  structure. 
Frequently,  the  regression  expressions  are  expressed  in  terms  of  the  log  of  the  two  properties. 
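
A log-log QPPR has the form log(y) = slope x log(x) + intercept. The Python sketch below shows
how such an expression is applied once the slope and intercept are known. The function name is
hypothetical and no coefficients from PEP or the literature are implied; the caller supplies
them.

```python
import math


def apply_log_log_qppr(slope, intercept, input_value):
    """Estimate y from x using log10(y) = slope * log10(x) + intercept."""
    if input_value <= 0:
        raise ValueError("input property must be positive to take its logarithm")
    return 10.0 ** (slope * math.log10(input_value) + intercept)


# Placeholder coefficients chosen only to exercise the function.
estimate = apply_log_log_qppr(0.5, 1.0, 100.0)
```

Working in logarithms is what lets a single straight-line regression span properties that vary
over many orders of magnitude.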

The operation of the Property-Property correlation module, illustrated in Figure 20, is
considerably different from that of the other modules because the choice of the regression
models used for the property estimation determines the required inputs.

Figure 20. Example card from PEP property-property module.

Selecting the Properties to be Estimated


Since the input for the Property/Property module depends on the properties to be estimated and
the regression models used, the user must first select the properties to be estimated and the
regression equations to be used before any input values are requested. The program keeps track
of the properties required and provides the appropriate input fields. If available, the
required properties can be imported directly from the associated chemical property database.
Currently, property-property relationships are available for the estimation of S, Kow, Koc, and
BCF; their entries on the card darken as the properties are selected.

Choosing  the  Property-Property  Regression  Model 

The  most  appropriate  model  to  use  in  estimating  the  property  of  interest  depends  on  the  class  of 
the chemical and which property-property relationships are available. The property-property
relationships that are available in PEP are divided into two general categories: regressions
developed using data from the PEP Chemical Physical Data Base and regressions obtained from
the literature. The two categories are separated by a grey line in the popup menu. To choose the
property-property  model,  use  the  popup  menu  directly  under  the  current  regression  by  holding  the 
mouse  button  down  while  over  the  title  of  the  current  regression.  The  “eye”  icon  to  the  right  of  the 
regression  title  will  display  the  regression  equation  and  statistics  associated  with  the  current 
relationship.  The  “book”  icon  will  show  the  reference  for  the  current  regression  if  the  regression  is 
from  the  literature. 

Viewing  the  Calculated  Values 

Once  the  properties  and  the  relationships  are  chosen,  the  properties  required  for  input  need  to 
be  entered.  The  properties  that  are  required  are  shown  on  the  right  side  of  the  card  with  a  line  next 
to  the  symbol.  The  values  can  be  retrieved  from  the  PEP  Chemical  Property  Database  by  making 
sure  the  property  name  is  correct  and  clicking  on  the  “Look  in  Prop.  DB”  button.  The  current  units 
of  the  inputs  will  be  shown  if  appropriate,  and  will  be  changed  as  needed  by  PEP  for  the  different 
regression  equations. 

When the “Estimate Properties” button is clicked, the properties that were input will be used
along with the chosen regression equations to estimate the selected properties. If the values
for the input properties have not been entered on the Property-Property card, the user will be
prompted to supply the required values. The property estimates, along with the calculated 95%
prediction interval if the statistical information is available, will be placed on the Results
card.

Limitations of the Property/Property Module


One major limitation to the use of property/property relationships is the lack of suitable
property values required for input into the regression models. For many new chemicals, no
physical/chemical property information is available. In addition, the quality of a
property-property regression model depends on the extent of the database used to develop the
relationship. For example, some property-property models are broader than others in the type
and number of compounds used in their development.

PEP BATCH


Overview 

PEP Batch provides users with a method for the convenient, unattended calculation of MCIs,
TSA, and UNIFAC activity coefficients and subsequent estimation of physical properties for
large numbers of compounds via the PEP Processor described earlier. Like the PEP Processor, PEP
Batch is divided into MCI, TSA, and UNIFAC modules. Each module, as illustrated in Figure 21
for the MCI module, requires the user to select the appropriate input file (i.e., SMILES
string, connection table, or 3-D atomic coordinates), choose the information to be sent to the
output file (i.e., chemical name, SMILES string, properties), and start the batch driver.


[PEP Batch “MCI Batch” card. The “2. Select Output” step lists check boxes for: Chemical Names
from file, SMILES string, MCIs, S, Kow, Pv, H, Koc, BCF, Include Regressions Used, and Rest of
the Input file.]


Figure 21. Example card for PEP Batch, MCI module.




Input  Structure 


The MCI and UNIFAC modules require the two-dimensional molecular structure of the
chemical to be entered using either SMILES strings or connection tables.

To enter SMILES strings into the MCI or UNIFAC batch modules, you must create a text file
containing the name of each chemical and its corresponding SMILES string in columns separated
by tabs, with a hard return at the end of each row. The text file can contain additional
tab-delimited information, but the SMILES strings and chemical names must be in the first or
second column. When you select the SMILES input option, you must indicate the column order on
the “Explain SMILES file” card shown in Figure 22. The text file can be created with word
processing or spreadsheet programs, or you can edit or create a file containing the SMILES
strings and chemical names within PEP by selecting SMILES from the “input structure type”
pulldown menu.
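
As an illustration of the file layout, the Python sketch below reads a tab-delimited text of
chemical names and SMILES strings, with the column order and an optional skipped first line
supplied by the caller. It mirrors the choices offered on the “Explain SMILES file” card but is
not PEP code; the function name and parameters are hypothetical.

```python
import csv
import io


def read_smiles_table(text, name_column=0, smiles_column=1, ignore_first_line=False):
    """Return a list of (chemical name, SMILES) pairs from tab-delimited text."""
    reader = csv.reader(io.StringIO(text), delimiter="\t", quoting=csv.QUOTE_NONE)
    rows = [row for row in reader if row]       # skip empty lines
    if ignore_first_line:
        rows = rows[1:]                          # e.g. a header row
    return [(row[name_column].strip(), row[smiles_column].strip()) for row in rows]


# Names in column one, SMILES in column two; extra columns would be ignored.
EXAMPLE = (
    "benzene\tc1ccccc1\n"
    "toluene\tc1cc(C)ccc1\n"
    "o-xylene\tc1c(C)c(C)ccc1\n"
    "ethylbenzene\tc1c(CC)cccc1\n"
)

entries = read_smiles_table(EXAMPLE)
```

The same function handles a file with the columns reversed by passing name_column=1 and
smiles_column=0.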


[“Explain SMILES file” card. A “View File Contents” button displays the selected file, here
four rows of chemical name and SMILES string:

benzene         c1ccccc1
toluene         c1cc(C)ccc1
o-xylene        c1c(C)c(C)ccc1
ethylbenzene    c1c(CC)cccc1

Radio buttons under “Column 1” and “Column 2” assign each column to either Chemical Name or
SMILES, and an “Ignore First Line” check box is provided.]


Figure 22. PEP Batch, explain SMILES file card.

To input connection table files into the MCI or UNIFAC batch modules, you must first place
them in a single folder. Select the “Connection tables” option from the “Select Input Type”
popup button and choose the folder that contains the connection table files using the standard
“open file” dialog box that appears. Highlighting any one of the files in the folder selects
all of the files in that folder and allows you to view or delete specific files. The files can
be of any type that is valid for the MCI module, as they will be converted if possible (see the
MCI module help). After selecting the input files, click on the advance arrow located at the
upper right of the card to advance to the next step.

The  TSA  batch  module  operates  in  the  same  manner  as  the  MCI  and  UNIFAC  modules  except 
that  the  calculation  of  TSA  requires  the  three-dimensional  cartesian  coordinates  for  each  atom  in  the 
chemical  of  interest.  The  TSA  batch  module  accepts  Cartesian  Coordinates  or  Alchemy  files. 
Alchemy  files  contain  both  the  two-dimensional  chemical  structure  and  the  coordinates.  This 
allows  PEP  to  calculate  the  chemical’s  TSA  and  determine  the  most  appropriate  TSA-property 
relationship  based  on  chemical  class.  From  within  the  standard  dialog  box,  you  can  click  on  any 
file  in  a  folder  to  select  all  of  the  files  in  that  folder.  The  files  will  then  be  displayed.  You  can 
also  view  or  delete  files. 

Output  Options 

Once the input files have been selected, the “output option” step becomes active. This allows
the user to select the properties to be calculated and any additional information that is
available (i.e., chemical name, SMILES string, MCIs, TSA, or UNIFAC activity coefficients) to be
exported to a tab-delimited text file.

Start  Batch  Driver 

After the output information is selected, the “Start Batch Driver” button becomes active.
Clicking this button brings up the standard Macintosh “save file” dialog box, which allows the user to
specify the name of the data file to be exported by PEP batch and the location where the file will be
saved.
CHEMICAL PROPERTY DATABASE

Overview 

Experimentally determined physical property data for about 800 compounds, each having at least
one value of aqueous solubility (S), octanol/water partition coefficient (Kow), vapor pressure
(Pv), organic carbon normalized soil sorption coefficient (Koc), bioconcentration factor (BCF), or
Henry's law constant (H), were compiled from a variety of literature sources and computerized
databases. Using this information, a chemical property database was constructed in
HyperCard™ and subsequently used for developing MCI-property, TSA-property and
property-property relationships. In addition to the properties listed above, the database includes the
following information: compound name and synonyms, a diagram of the 2-D chemical structure,
SMILES notation, uses, CAS number, chemical formula, molecular weight (MW), boiling point
(BP), melting point (MP), and appropriate references for each value. A built-in unit conversion
utility enables users to quickly view property values in a variety of commonly used units. The
database is directly connected to the PEP Processor stack.

The Chemical Property Database also allows the user to search for chemical
compounds by full or partial name or synonym, to sort the compounds by name, boiling point,
melting point, or molecular weight, and to transfer to any of the property estimation
modules. In addition, the user can easily edit existing values, add new values, or export information
to a text file or another database.

In addition to the physical properties, information describing the environmental persistence and
toxicity of specific chemicals can also be entered into the database. Placeholders for
biodegradation rates, hydrolysis rates, photolysis rates and LC50s, along with the appropriate
references and comments, have been incorporated into the database. This feature was added
at the request of test users; however, no degradation or toxicity data have yet been
entered. The chemical property database is illustrated in Figures 23 and 24.
Figure 23. Example card from the Chemical Property Database, showing the structure view and the
Chemical/Physical Properties panel for 2,2',6,6'-tetrachlorobiphenyl.

Figure 24. PEP database degradation properties. The card provides hydrolysis, photolysis, and
biodegradation entries with fields for half-life (t 1/2), matrix, temperature (°C), and comments.
Searching for Chemical Compounds in PEP’s Database


Chemicals  in  the  database  can  be  found  by  full  chemical  name,  partial  chemical  name, 
synonyms,  or  CAS  number  searches,  or  by  selecting  the  chemical  from  a  current  list  of 
compounds.  The  user  selects  the  desired  option  from  the  “Find”  menu. 

Searching by chemical name, partial chemical name, synonym, or CAS
number is done using the finding tools built into HyperCard. When a find option is chosen, a
standard dialog box is used to enter the string to be found. Finding by full chemical name
or CAS number requires that the entire contents of the appropriate field match the string
that was entered. Finding by partial chemical name or synonym requires only that the entire
string that was entered be present somewhere in the appropriate field.
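
The difference between the two matching rules can be sketched in a few lines of Python. This is only an illustration, not PEP's HyperTalk code: the record layout and the function names `find_exact` and `find_partial` are hypothetical, and case-insensitive comparison is an assumption.

```python
def find_exact(cards, field, query):
    """Full-name or CAS search: the whole field must equal the query."""
    q = query.strip().lower()
    return [c for c in cards if c[field].strip().lower() == q]


def find_partial(cards, field, query):
    """Partial-name or synonym search: the query must occur inside the field."""
    q = query.strip().lower()
    return [c for c in cards if q in c[field].lower()]


# A tiny stand-in for the database cards (illustrative records only).
cards = [
    {"name": "1-octanol", "cas": "111-87-5"},
    {"name": "2-octanol", "cas": "123-96-6"},
    {"name": "acetone", "cas": "67-64-1"},
]

exact_hits = find_exact(cards, "name", "octanol")      # no card is named just "octanol"
partial_hits = find_partial(cards, "name", "octanol")  # both octanol isomers match
```

With these records, the exact search for "octanol" finds nothing, while the partial search returns both isomers.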

The “select from list” option allows the user to select a compound from the current list of
chemical names. The chemical names appear alphabetically sorted in a dialog box. Typing the first
letter or number of the chemical name will scroll the list to the first chemical starting with that character.
The chemical card is located and displayed by double-clicking the chemical name or by
selecting the chemical name and then clicking the “Go There” button.

Sorting  the  Database 

The  “Sort”  menu  at  the  top  of  the  card  allows  you  to  sort  the  database  by  name,  melting  point 
(MP),  boiling  point  (BP),  or  molecular  weight  (MW).  NOTE:  the  alphabetic  sort  function  built 
into  HyperCard  places  all  chemicals  with  numbers  preceding  the  letters  in  the  chemical  name  before 
those  chemicals  having  only  letters  in  the  chemical  name.  For  example,  after  sorting  by  name,  1- 
octanol  would  come  before  acetone  and  2-octanol,  but  after  1-decanol. 
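
The ordering described in the note is what a plain character-by-character sort produces, because digits sort before letters. A minimal Python sketch, assuming simple character-code ordering (HyperCard's own comparison rules may differ in detail):

```python
# Names beginning with a locant digit sort ahead of names beginning with a letter.
names = ["acetone", "2-octanol", "1-octanol", "1-decanol"]
sorted_names = sorted(names)
print(sorted_names)
# → ['1-decanol', '1-octanol', '2-octanol', 'acetone']
```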

Adding  or  Deleting  Data 

New chemicals may be added to the database by choosing the “Add new card” option from the
Values menu. The user will then be prompted for the chemical name, CAS number, molecular
formula, melting point, boiling point, molecular weight, and the SMILES string.

Additional values for melting point, boiling point, S, Kow, Pv, H, Koc, BCF, and Ka can be
added to a chemical already in the database using the “Add value” option accessed from the Values
menu. As each new value is entered, the user is prompted for the reference. The reference can be
an existing reference, or the user can add a new one. If a new reference is to be added, the
user will be prompted for the author, year, title, publication, volume, issue, and page numbers.
The user will also be given a chance to edit the new reference.
An entire card or specific values contained on a card can be deleted by choosing the appropriate
item from the Values menu. A reference can also be deleted if it was entered incorrectly. The data
will not be deleted until the user verifies the request through a standard dialog box.

Printing  Information  From  the  Database 

The user can print the current card or a series of cards using the print features built into
HyperCard. To print a series of cards, the user first selects that option from the “Print” menu.
Then, while each card to be printed is displayed, the user chooses the “Print this card”
option from the “Print” menu. When the desired cards have been printed, the user
selects “Stop printing” from the “Print” menu. The user can also print a list of all the
references contained in the PEP database.

Exporting  Information  From  the  Database 

Information in the database can be exported as text in either a line format or a spreadsheet format
using the “Export” menu. If the line format option is used, the data will appear with each data field
on its own line and each chemical separated by a solid line. This option is most appropriate if a word
processor will be used to print or manipulate the exported data. If the spreadsheet format is used,
the data will be put in tab-delimited columns.
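
The two export layouts can be illustrated with a short Python sketch. The field names, the separator made of dashes, and the header row in the spreadsheet layout are assumptions made for the example; the exact text PEP writes is not specified here.

```python
import csv
import io

FIELDS = ["name", "cas", "mw"]  # hypothetical subset of the database fields


def export_line_format(records, fields=FIELDS):
    """One field per line, with a solid line between chemicals."""
    blocks = ["\n".join(f"{f}: {rec[f]}" for f in fields) for rec in records]
    separator = "\n" + "-" * 40 + "\n"
    return separator.join(blocks) + "\n"


def export_spreadsheet_format(records, fields=FIELDS):
    """One chemical per row, columns delimited by tabs."""
    buf = io.StringIO()
    writer = csv.writer(buf, delimiter="\t", lineterminator="\n")
    writer.writerow(fields)  # header row (an assumption)
    for rec in records:
        writer.writerow([rec[f] for f in fields])
    return buf.getvalue()


records = [
    {"name": "acetone", "cas": "67-64-1", "mw": 58.08},
    {"name": "1-octanol", "cas": "111-87-5", "mw": 130.23},
]

line_text = export_line_format(records)
sheet_text = export_spreadsheet_format(records)
```

The tab-delimited text opens directly in a spreadsheet program, while the line format reads naturally in a word processor.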

Moving  to  Other  PEP  Modules 

The “Go” menu at the top of the card allows you to move to any PEP module. To use the Go
option, move the pointer to the Go menu at the top of the card. Hold the mouse button down on the menu
until the selections appear. Then move the pointer until the selection of choice is highlighted and
release the mouse button.

Changing  the  Units  of  Measurement 

The  popup  menu  under  each  property’s  units  allows  you  to  change  the  units  for  solubility, 
Henry’s  law,  and  vapor  pressure.  All  values  of  the  property  will  be  changed  to  reflect  the  new 
units. 
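
A unit change of this kind amounts to multiplying every stored value by a fixed factor. The sketch below shows the idea for vapor pressure, converting through pascals. The list of units is an assumption (the units PEP actually offers are not listed here); the conversion factors themselves are standard.

```python
# Factors that convert each unit to pascals.
TO_PASCAL = {
    "Pa": 1.0,
    "kPa": 1000.0,
    "atm": 101325.0,
    "mmHg": 101325.0 / 760.0,
}


def convert_pressure(value, from_unit, to_unit):
    """Convert one vapor pressure value between two units."""
    return value * TO_PASCAL[from_unit] / TO_PASCAL[to_unit]


def convert_all(values, from_unit, to_unit):
    """Convert every stored value, as happens when the units popup changes."""
    return [convert_pressure(v, from_unit, to_unit) for v in values]


in_mmhg = convert_all([1.0, 0.5], "atm", "mmHg")  # approximately [760.0, 380.0]
```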
PEP Models


Overview 

To illustrate the practical application of PEP, an additional stack called PEP Models was
developed. This stack, which contains the algorithms for the Level 1 and 2 Fugacity Models
(Mackay, 1980), is linked directly to the PEP Processor, but can also be used independently.
The Level 1 Fugacity Model considers a unit world consisting of six compartments: air, water,
soil, suspended solids, sediment, and biota (as shown in Figure 25).


Figure  25.  Representation  of  Fugacity  Level  1  compartments. 


The model predicts the equilibrium concentrations of the chemical of interest in each
compartment using the fugacity approach described by Mackay (1980). The model requires the
input of Koc, H and BCF, which can be read directly from the PEP Processor or the PEP chemical
property database, if available. In addition to the chemical-specific properties, the density and
volume of each compartment must be specified, along with the organic carbon content of the soil,
sediment and suspended sediment. An editable set of default values for compartment density,
volume and organic carbon content is provided, as shown in Figure 26.
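
The Level 1 calculation can be sketched as follows. This is a minimal illustration of the general fugacity approach, not PEP's HyperTalk implementation. It assumes H in Pa·m3/mol, Koc and BCF in L/kg, and densities in kg/L, so that the fugacity capacities are Z(air) = 1/RT, Z(water) = 1/H, Z(solid) = Koc·foc·density/H and Z(biota) = BCF·density/H. The compartment sizes and the chemical properties below are placeholder values, not PEP's defaults.

```python
R = 8.314  # gas constant, Pa*m^3/(mol*K)


def z_value(kind, H, koc, bcf, density=1.0, foc=0.0, temp_k=298.15):
    """Fugacity capacity, mol/(m^3*Pa), for one compartment type."""
    if kind == "air":
        return 1.0 / (R * temp_k)
    if kind == "water":
        return 1.0 / H
    if kind == "solid":  # soil, sediment, suspended solids
        return koc * foc * density / H
    if kind == "biota":
        return bcf * density / H
    raise ValueError(f"unknown compartment kind: {kind}")


def level1(total_mol, H, koc, bcf, world, temp_k=298.15):
    """Equilibrium distribution of total_mol of chemical over the unit world."""
    z = {name: z_value(c["kind"], H, koc, bcf,
                       c.get("density", 1.0), c.get("foc", 0.0), temp_k)
         for name, c in world.items()}
    capacity = sum(world[n]["volume"] * z[n] for n in world)  # sum of V*Z
    fugacity = total_mol / capacity                            # Pa
    conc = {n: fugacity * z[n] for n in world}                 # mol/m^3
    percent = {n: 100.0 * conc[n] * world[n]["volume"] / total_mol
               for n in world}
    return fugacity, conc, percent


# Placeholder unit world (volumes in m^3, densities in kg/L, foc as a fraction).
WORLD = {
    "air": {"kind": "air", "volume": 6.0e9},
    "water": {"kind": "water", "volume": 7.0e6},
    "soil": {"kind": "solid", "volume": 4.5e4, "density": 1.5, "foc": 0.02},
    "suspended solids": {"kind": "solid", "volume": 35.0, "density": 1.5, "foc": 0.04},
    "sediment": {"kind": "solid", "volume": 2.1e4, "density": 1.5, "foc": 0.04},
    "biota": {"kind": "biota", "volume": 7.0, "density": 1.0},
}

fugacity, conc, percent = level1(100.0, H=10.0, koc=1000.0, bcf=500.0, world=WORLD)
```

Because all compartments share one fugacity at equilibrium, the concentration ratio between any compartment and water is simply the ratio of their Z values, and the percentages sum to 100.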
Figure 26. PEP Models card, input for Fugacity Level 1

The Level 2 Fugacity Model allows for the chemical of interest to degrade in each
compartment, move by advection through the water and air phases, and be emitted into the unit
world. The rate values for each of these processes must be entered by the user. The degradation
rates for each compartment can be entered either as half-life (t 1/2) values in hours or as first-order
reaction rate constants in 1/hours, as shown in Figure 27. The advection rate data can be entered
either as a residence time or flow rate together with the concentration, or by directly entering the
mass flow rate in moles per hour. The emission rate is entered in units of moles per hour.
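
The alternative input forms are related by simple conversions, sketched below under stated assumptions: first-order kinetics, so that k = ln 2 / t 1/2, and an advective flow rate equal to compartment volume divided by residence time. The function names are hypothetical and do not come from PEP.

```python
import math


def rate_constant_from_half_life(t_half_h):
    """First-order rate constant (1/h) from a half-life in hours."""
    return math.log(2) / t_half_h


def half_life_from_rate_constant(k_per_h):
    """Half-life in hours from a first-order rate constant (1/h)."""
    return math.log(2) / k_per_h


def advective_mass_flow(conc_mol_m3, flow_m3_h=None, volume_m3=None,
                        residence_time_h=None):
    """Mass flow (mol/h) from a concentration and either a flow rate
    or a compartment volume with its residence time."""
    if flow_m3_h is None:
        if volume_m3 is None or residence_time_h is None:
            raise ValueError("give a flow rate, or a volume and residence time")
        flow_m3_h = volume_m3 / residence_time_h
    return flow_m3_h * conc_mol_m3


k = rate_constant_from_half_life(100.0)  # about 0.00693 per hour
flow = advective_mass_flow(2.0, volume_m3=100.0, residence_time_h=50.0)  # 4.0 mol/h
```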


Figure 27. PEP Models card, input for Fugacity Level 2. The card, shown for
2,2',6,6'-tetrachlorobiphenyl, lists the six compartments (air, water, soil, suspended solids,
sediment, biota) and the remaining input steps: 3. Input Reaction Rate Data (reaction half-life in
hours or reaction rate constant in 1/hours); 4. Input Advective Data (mass flow rate in mol/hr, or
concentration in mol/m3 with residence time in hours or flow rate in m3/hr); 5. Input Emission Rate
(mol/hr); and 6. Calculate Distribution, or Input Level 3 Data.
Input Property Values

To calculate the environmental distribution of a chemical, the user must enter several
physical-chemical properties along with the size and density of each compartment. The
physical-chemical properties that must be input for the six-compartment unit world are Koc, H, and
BCF. However, in step 2 the user may choose to eliminate one or more compartments to simplify
the model or to more closely examine the distribution of a chemical in a sub-environment such as
the hypolimnion of a lake (i.e., water, air, biota). This would eliminate the need for a value for
Koc. PEP determines which physical-chemical property values are needed depending on the unit
world that the user has defined. The physical-chemical properties that are not needed will be
dimmed.
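
The dependency between the chosen compartments and the required properties can be sketched as a simple lookup. The text only states that dropping the solid phases removes the need for Koc; the pairing of H with air and of BCF with biota is an assumption based on how those coefficients are defined, and the function name is hypothetical.

```python
SOLID_PHASES = {"soil", "suspended solids", "sediment"}


def required_properties(compartments):
    """Return the set of chemical properties needed for the chosen unit world."""
    selected = set(compartments)
    if "water" not in selected:
        # Partition coefficients are defined relative to the aqueous phase.
        raise ValueError("water must always be part of the unit world")
    needed = set()
    if "air" in selected:
        needed.add("H")
    if selected & SOLID_PHASES:
        needed.add("Koc")
    if "biota" in selected:
        needed.add("BCF")
    return needed


lake_subset = required_properties(["water", "air", "biota"])  # Koc is not needed
```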


When available, experimental values are preferred over estimated values in environmental fate
modeling. Experimental values for the physical-chemical properties can be searched for in the
Chemical Property Database by using the “Look for Values in Prop.DB” option. When clicked, a
full chemical name search is initiated, and if the chemical is found, the values of the three
properties required by the Fugacity Level 1 model will be displayed in a dialog box. The values can
be copied to the Fugacity card using the buttons on the dialog box, as illustrated in Figure 28. The
check boxes next to the property names on the Fugacity card indicate whether the property came from the
database.


Figure 28. Example dialog box resulting from the “Look for Values in Prop.DB” option. The dialog,
headed “Values from the PEP Property Database: Select the properties to copy to the Model,” lists
Log Koc: 1.780, Log H: -0.648, and Log BCF: Not Found, with Cancel, Copy All, and Copy Selected
buttons.


Input  Environmental  Values 

The values used for the unit world can be changed in step 2. The available compartments are
air, water, soil, suspended solids, sediment, and biota. Water must always be part of the unit
world because the partition coefficients are defined as the ratio of the chemical concentration in the
non-aqueous phase to that in the aqueous phase. The rest of the compartments can be eliminated by clicking
the check box next to the compartment name. The densities, volumes, and percent organic carbon
(if appropriate) of each compartment can be changed.

Different  user  defined  environments  or  unit  worlds  can  also  be  saved  using  the  “Save  Current 
Values”  button  and  later  recalled  using  the  “Values:”  popup  menu.  The  “Delete  Saved  Values” 
button  can  be  used  to  remove  a  saved  set  of  values  from  the  list  that  will  popup  when  the  “Values” 
button  is  used.  The  “Set  Default  Values”  button  can  be  used  to  set  the  unit  world  that  is  used  when 
the  PEP  Models  stack  is  opened. 

Calculate  Distribution 

After the user inputs all the required information and clicks the “calculate distribution” button, the
model calculations are performed in HyperTalk and the results are presented in both tabular and
graphical form, as illustrated in Figure 29. The graphical display can be changed between bar
(concentration of the chemical in each phase) and pie (percent of the chemical in each phase) chart
forms using the “Graph:” popup menu. The values of the distribution coefficients that were used
in the calculations are also shown on the results card.
Figure 29. PEP Models results card
PEP Help

Information regarding the operation of the chemical property database and the property
estimation, models, and batch modules is available in the PEP Help stack. This stack is easily
accessed at any time within the PEP system. The organization and layout of each help card is
similar to that illustrated in Figure 30 for the MCI module. The user can select the topic of interest
by clicking on the appropriate radio button, and the information on that subject will be displayed in
the scrolling field.


PEP Help: MCI Module

MCI Options (radio buttons): overview (selected); input structure; calculate MCIs; choose
properties & regression equations; estimate properties; limitations.

Text shown in the scrolling field:

Molecular connectivity, developed by Randic (1972) and refined and expanded by Kier and Hall
(1976, 1980, 1986), is a method of bond counting from which topological indexes, based on the
structure of the compound, can be derived. For a given molecular structure, several types and
orders of molecular connectivity indexes (MCIs) can be calculated. Information on the molecular
size, branching, cyclization, unsaturation, and heteroatom content of a molecule is encoded in
these various indices (Kier and Hall, 1976). Molecular connectivity has been used to predict
Koc (Sabljic, 1984; Sabljic, 1987; Bahnick and Doucette, 1988), S (Doucette, 1985;
Nirmalakhandan and Speece, 1988a), Kow (Doucette ...

Figure 30. Example card from PEP Help stack for MCI module.
APPENDIX B

STATISTICAL  SUMMARY  FOR  MCI-PROPERTY, 
TSA-PROPERTY  AND  PROPERTY-PROPERTY 
RELATIONSHIPS  INCORPORATED  INTO  PEP 


STATISTICS  Class: S Universal

Regression Results
Variable     Coef.      Std. Error    t
Constant     0.3917     (illegible)   (illegible)
vp1          -0.9257    0.0316        -29.3
Δvp1         1.8251     0.1047        17.4
MP-25        -0.01

Analysis of Variance Table
Source       RSS          df     MSS      F
Regression   (illegible)  2      445      446
Residual     360.820      362    0.997
Total        1250.086     364    3.4343

r2 = 71.1%    nobs = 365    s = 0.9985

Plots: Predicted vs. Exp. (experimental log S); Residual vs. Predicted; Residual vs. Prob.
(number of standard deviations).

STATISTICS  Class: S PCBs

Regression Results
Variable     Coef.      Std. Error    t
Constant     1.8528     0.2204        8.41
bp6          6.2017     0.3654        17
MP-25        -0.01

Analysis of Variance Table
Source       RSS          df     MSS           F
Regression   6.50320      1      (illegible)   (illegible)
Residual     0.158026     7      (illegible)
Total        6.661226     8      (illegible)

r2 = 97.6%    nobs = 9    s = 0.1503

Plots: Log S vs. MCI; Residual vs. Predicted; Residual vs. Prob. (number of standard deviations).

STATISTICS  Class: S Ureas

Regression Results
Variable     Coef.      Std. Error    t
Constant     3.758      0.4631        8.1
bp1          -1.0747    0.0808        -13.3
MP-25        -0.01

Analysis of Variance Table
Source       RSS          df     MSS      F
Regression   32.5444      1      32.54    177
Residual     1.65604      9      0.184
Total        34.20044     10     3.42

r2 = 95.2%    nobs = 11    s = 0.4290

Plots: Log S vs. MCI (bp1); Residual vs. Predicted; Residual vs. Prob.


STATISTICS  Class: S Anilines

Regression Results
Variable     Coef.         Std. Error    t
Constant     (illegible)   0.401         2.47
bp1          -0.7157       0.0734        -9.75
MP-25        -0.01

Analysis of Variance Table
Source       RSS          df     MSS      F
Regression   45.5668      1      45.57    95.0
Residual     9.10898      19     0.4794
Total        54.67470     20     2.7337

r2 = 83.3%    nobs = 21    s = 0.6924

Plots: Log S vs. MCI; Residual vs. Predicted; Residual vs. Prob.
STATISTICS  Class: S Alcohols

Regression Results
Variable     Coef.      Std. Error    t
Constant     2.8524     0.0839        34
bp1          -1.1440    0.0201        -57
MP-25        -0.01

Analysis of Variance Table
Source       RSS          df     MSS      F
Regression   167.948      1      168      3248
Residual     3.41245      66     0.0517
Total        171.3604     67     2.5576

r2 = 98.0%    nobs = 68    s = 0.2274

Plots: Log S vs. MCI (bp1); Residual vs. Predicted; Residual vs. Prob. (number of standard
deviations).


STATISTICS  Class: S PAHs

Regression Results
Variable     Coef.      Std. Error    t
Constant     -1.1353    0.2894        -3.92
np1          -0.5011    0.0373        -13.4
MP-25        -0.01

Analysis of Variance Table
Source       RSS          df     MSS      F
Regression   30.5352      1      30.54    181
Residual     5.74828      34     0.1691
Total        36.28346     35     1.0367

r2 = 84.2%    nobs = 36    s = 0.4112

Plots: Log S vs. MCI (np1); Residual vs. Predicted; Residual vs. Prob. (number of standard
deviations).


STATISTICS  Class: Pv Universal
(card contents not legible in the source)

STATISTICS  Class: Pv PCBs
(card contents not legible in the source)


STATISTICS  Class: Koc Universal

Regression Results
Variable     Coef.      Std. Error    t
Constant     0.6111     0.1767        3.46
np1          0.4647     0.0289        16.1
Δvp0         -1.2243    0.1176        -10.4

Analysis of Variance Table
Source       RSS          df     MSS      F
Regression   159.141      2      79.57    138
Residual     96.6174      168    0.5751
Total        255.7584     170    1.5044

r2 = 62.2%    nobs = 171    s = 0.7583

Plots: Predicted vs. Exp. (experimental log Koc); Residual vs. Predicted; Residual vs. Prob.
(number of standard deviations).


STATISTICS  Class: Koc Ureas

Regression Results
Variables: Constant, np6, and a third index whose name is illegible. The coefficients, standard
errors, t values, and the Analysis of Variance Table are not legible in the source.

Plots: Predicted vs. Exp. (experimental log Koc); Residual vs. Predicted; Residual vs. Prob.
(number of standard deviations).

STATISTICS  Class: Koc Triazines

Regression Results
Variable     Coef.      Std. Error    t
Constant     -0.5643    0.7221        -0.781
bp1          0.6296     0.0919        6.85
Δvp0         -0.9382    0.3763        -2.49

Analysis of Variance Table
Source       RSS          df     MSS           F
Regression   1.60125      2      0.8006        32.6
Residual     0.221318     9      0.0246
Total        1.8226       11     (illegible)

r2 = 87.9%    nobs = 12    s = 0.1568

Plots: Predicted vs. Exp. (experimental log Koc); Residual vs. Predicted; Residual vs. Prob.
(number of standard deviations).


STATISTICS  Class: Koc PCBs

Regression Results
Variable     Coef.      Std. Error    t
Constant     0.0018     1.262         0.001
nc3          3.3677     0.799         4.21

Analysis of Variance Table
Source       RSS           df            MSS           F
Regression   (illegible)   1             (illegible)   (illegible)
Residual     0.618298      6             (illegible)
Total        (illegible)   (illegible)   (illegible)

r2, nobs, and s are not legible in the source.

Plots: Log Koc vs. MCI (nc3); Residual vs. Predicted; Residual vs. Prob. (number of standard
deviations).
STATISTICS  Class: Koc (class name illegible)

Regression Results
(not legible in the source)

Analysis of Variance Table
Source       RSS          df     MSS      F
Regression   24.1463      1      24.15    196
Residual     1.60011      13     0.1231
Total        25.74641     14     1.839

r2 = 93.8%    nobs = 15    s = 0.3508

Plots: Log Koc vs. MCI; Residual vs. Predicted; Residual vs. Prob.


STATISTICS  Class: Koc Halogenated Aromatics

Regression Results
Variable     Coef.      Std. Error    t
Constant     -2.0451    0.5708        -3.58
np1          0.7411     0.1318        5.62

Analysis of Variance Table
Source       RSS          df     MSS      F
Regression   2.03980      1      2.0398   31.6
Residual     0.515841     8      0.0645
Total        2.555641     9      0.284

r2 = 79.8%    nobs = 10    s = 0.2539

Plots: Log Koc vs. MCI (np1); Residual vs. Predicted; Residual vs. Prob.


STATISTICS  Class: Pv PAHs

Regression Results
Variable     Coef.      Std. Error    t
Constant     8.7188     0.4656        18.7
np1          -1.4892    0.0702        -21.2

Analysis of Variance Table
Source       RSS          df     MSS      F
Regression   35.5950      1      35.60    450
Residual     0.712338     9      0.0791
Total        36.30733     10     3.6307

Plots: Log Pv vs. MCI (np1); Residual vs. Predicted; Residual vs. Prob.


STATISTICS  Class: Pv Halogenated Aromatics

Regression Results
Variable     Coef.         Std. Error    t
Constant     (illegible)   0.340         (illegible)
bp1          (illegible)   0.0701        (illegible)

Analysis of Variance Table
The regression F value reads 494; the remaining entries are not legible in the source.

Plots: Log Pv vs. MCI (bp1); Residual vs. Predicted; Residual vs. Prob.
STATISTICS  Class: Kow Universal

Regression Results
Variable     Coef.      Std. Error    t
Constant     1.1527     0.1411        8.17
vp0          0.3978     0.0175        22.7
Δvp0         -1.9843    0.1234        -16.1

Analysis of Variance Table
Source       RSS          df     MSS      F
Regression   335.264      2      168      315
Residual     102.809      193    0.5327
Total        438.073      195    2.2465

r2 = 76.5%    nobs = 196    s = 0.7299

Plots: Predicted vs. Exp. (experimental log Kow); Residual vs. Predicted; Residual vs. Prob.
(number of standard deviations).


STATISTICS  Class: Kow Ureas

Regression Results
Variable     Coef.         Std. Error    t
Constant     3.9333        0.2288        (illegible)
vp1          (illegible)   (illegible)   22.5

Analysis of Variance Table
Source       RSS          df     MSS           F
Regression   18.8269      1      18.83         504
Residual     0.18667      5      0.0373
Total        19.0136      6      (illegible)

r2 = 99.0%    nobs = 7    s = 0.193

Plots: Log Kow vs. MCI (vp1); Residual vs. Predicted; Residual vs. Prob. (number of standard
deviations).

STATISTICS  Class: Kow PCBs

Regression Results
Variable                Coef.      Std. Error    t
Constant                3.664      0.1888        19.4
bp4                     0.9026     0.2587        3.62
bp (order illegible)    5.8313     0.6189        9.42

Analysis of Variance Table
Source       RSS          df     MSS      F
Regression   9.80627      2      4.9031   167
Residual     0.382424     13     0.0294
Total        10.18869     15     0.6792

r2 = 96.2%    nobs = 16    s = 0.1715

Plots: Predicted vs. Exp.; Residual vs. Predicted; Residual vs. Prob. (number of standard
deviations).


STATISTICS  Class: Kow PAHs

Regression Results
Variable     Coef.      Std. Error    t
Constant     0.7812     0.218         3.58
np0          0.4103     0.0206        19.9

Analysis of Variance Table
Source       RSS          df     MSS      F
Regression   10.9031      1      10.90    395
Residual     0.303697     11     0.0276
Total        11.20680     12     0.9339

r2 = 97.3%    nobs = 13    s = 0.1662

Plots: Log Kow vs. MCI (np0); Residual vs. Predicted; Residual vs. Prob. (number of standard
deviations).

STATISTICS  Class: Kow Anilines

Regression Results
Variable     Coef.         Std. Error    t
Constant     (illegible)   (illegible)   (illegible)
bp1          1.0828        0.0879        12.3
Δvp6         3.9803        0.6307        6.31

Analysis of Variance Table
Source       RSS          df     MSS      F
Regression   18.5293      2      9.265    172
Residual     0.646843     12     0.0539
Total        19.17614     14     1.3697

r2 = 96.6%    nobs = 15    s = 0.2322

Plots: Predicted vs. Exp. (experimental log Kow); Residual vs. Predicted; Residual vs. Prob.
(number of standard deviations).

STATISTICS  Class: Kow Alcohols

Regression Results
Variable     Coef.         Std. Error    t
Constant     (illegible)   (illegible)   (illegible)
bp1          1.1666        0.0354        32.9

Analysis of Variance Table
Source       RSS          df     MSS      F
Regression   5.03456      1      5.0346   1085
Residual     0.013924     3      0.0046
Total        5.048484     4      1.2621

r2 = 99.7%    nobs = 5    s = 0.0681

Plots: Log Kow vs. MCI (bp1); Residual vs. Predicted; Residual vs. Prob. (number of standard
deviations).

STATISTICS  Class: H Universal

Regression Results
Variable                    Coef.         Std. Error    t
Constant                    0.0003        0.2096        0.001
(index name illegible)      0.2496        0.0422        5.91
Δvp1                        (illegible)   0.2082        (illegible)

Analysis of Variance Table
Source       RSS          df     MSS           F
Regression   93.9802      2      46.99         74.7
Residual     45.9296      73     0.6292
Total        139.9098     75     (illegible)

Plots: Predicted vs. Exp. (experimental log H); Residual vs. Predicted; Residual vs. Prob.
(number of standard deviations).


STATISTICS  Class: H PCBs

Regression Results
Variable                  Coef.         Std. Error    t
Constant                  (illegible)   0.099         (illegible)
npc6                      0.3676        0.0247        14.9
pc4 (prefix illegible)    (illegible)   0.0885        12.8

Analysis of Variance Table
(not legible in the source)

Plots: Predicted vs. Exp. (experimental log H); Residual vs. Predicted; Residual vs. Prob.

STATISTICS  Class: H PAHs

Regression Results
Variable                     Coef.      Std. Error    t
Constant                     0.8641     0.2958        2.89
np (order illegible)         -0.5197    0.0495        -10.5

Analysis of Variance Table
Source       RSS          df     MSS      F
Regression   0.687162     1      0.6872   110
Residual     0.024924     4      0.0062
Total        0.712086     5      0.1424

r2 = 96.5%    nobs = 6    s = 0.0789

Plots: Log H vs. MCI; Residual vs. Predicted; Residual vs. Prob. (number of standard deviations).


STATISTICS  Class: H Halogenated Aliphatics

Regression Results
Variable     Coef.      Std. Error    t
Constant     0.7325     0.2612        2.8
vp1          -1.2237    0.1559        -7.85
np2          0.6848     0.0961        7.13

Analysis of Variance Table
Source       RSS          df     MSS      F
Regression   6.39442      2      3.1972   33.9
Residual     1.60216      17     0.0942
Total        7.99658      19     0.4209

r2 = 80.0%    nobs = 20    s = 0.3070

Plots: Predicted vs. Exp. (experimental log H); Residual vs. Predicted; Residual vs. Prob.

STATISTICS  Class: BCF Universal


STATISTICS  Class: BCF PCBs


Regression Results

Variable   Coef.     Std. Error  t
Constant   0.9816    0.2329      [illegible]
np1        0.4347    0.0415      10.5
Δvp0       -1.9999   0.2086      [illegible]

Analysis of Variance Table

Source      RSS      df  MSS     F
Regression  60.1143  2   30.06   61.6
Residual    32.7178  67  0.4883
Total       92.8321  69  1.3454

r2 = 64.8%   nobs = 70   S = 0.6988
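The F ratio, r2, and S printed with each Analysis of Variance table follow from the regression and residual sums of squares and their degrees of freedom. The following is a minimal sketch of that arithmetic, using the sums of squares as read from the table above (60.1143 on 2 df and 32.7178 on 67 df); the function name and dictionary keys are illustrative and are not taken from the original program.

```python
import math

def anova_summary(ss_regression, df_regression, ss_residual, df_residual):
    """Derive the summary statistics printed beneath an ANOVA table."""
    ms_regression = ss_regression / df_regression  # regression mean square
    ms_residual = ss_residual / df_residual        # residual mean square
    ss_total = ss_regression + ss_residual
    return {
        "F": ms_regression / ms_residual,
        "r2_percent": 100.0 * ss_regression / ss_total,
        "S": math.sqrt(ms_residual),               # standard error of the estimate
        "n_obs": df_regression + df_residual + 1,  # total df plus one
    }

stats = anova_summary(60.1143, 2, 32.7178, 67)
print({key: round(value, 4) for key, value in stats.items()})
```

The same function can be applied to any of the other tables in this appendix to check a transcribed mean square or F value against its sums of squares.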

Regression Results

Variable   Coef.     Std. Error  t
Constant   7.5756    0.6252      12.1
nch6       -21.162   4.009       -5.28


Analysis of Variance Table

Source      RSS       df  MSS     F
Regression  0.641375  1   0.6414  27.9
Residual    0.138126  6   0.023
Total       0.779501  7   0.1114

r2 = 82.3%   nobs = 8   S = 0.1517

[Plots: Predicted vs. Experimental log BCF; Residual vs. Predicted; Residual vs. Prob. (number of standard deviations)]

[Plots: Log BCF vs. MCI (nch6); Residual vs. Predicted; Residual vs. Prob. (number of standard deviations)]
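The "Residual vs. Prob." panels plot residuals against a scale labelled "number of standard deviations", which reads like a normal probability plot. The sketch below shows one way such plot coordinates could be produced. It assumes the plotting positions (i - 0.5)/n, which is only one common convention, and the residual values are hypothetical rather than taken from the report.

```python
from statistics import NormalDist

def normal_scores(residuals):
    """Pair each sorted residual with its expected standard-normal quantile."""
    n = len(residuals)
    ordered = sorted(residuals)
    # Assumed plotting positions: (i - 0.5) / n for i = 1..n.
    scores = [NormalDist().inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
    return list(zip(scores, ordered))

# Hypothetical residuals, for illustration only.
pairs = normal_scores([0.31, -0.12, 0.05, -0.44, 0.20])
for score, residual in pairs:
    print(f"{score:6.2f}  {residual:6.2f}")
```

If the residuals are roughly normal, the printed pairs fall close to a straight line when plotted.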


STATISTICS  Class: BCF Halogenated Aliphatics


STATISTICS  Class: Koc Sabljic 1988


STATISTICS  Class: Koc Sabljic 1988 [subclass illegible]


STATISTICS  Class: S Gerstl 1987 Alcohols


STATISTICS  Class: Kow Gerstl 1987 Alcohols


Gerstl, Z., and C.S. Helling. 1987. Evaluation of molecular connectivity as a
predictive method for the adsorption of pesticides by soils. J. Environ. Sci.
Health, B22(1), 55-69.


STATISTICS  Class: S Alcohols [3 MCI]


Nirmalakhandan, N.N., and R.E. Speece. 1988. Prediction of Aqueous Solubility
of Organic Chemicals Based on Molecular Structure. Environ. Sci. Technol.
22(3), p328-337.


_ Regression  Results 

S  M. 

Variant  Cast  Error  t 


-0  56 

0  487 

§ 

_ 

S' 

Analysis  of  Variance  Table 
Saarca  RSS  df  MSS 


Regression  Results 
Std 

Variable  Ceaf  Irrar  t 

constat*  *  52 
npl  -40*0 

vpl  2  00$ 

vu3  0  18b 


Analysis  of  Variance  T nb !e 
Saarca  RSS  df  MSS 


df  M5S 

"T~~T 


STATISTICS  Class: S Universal from Kow


STATISTICS  Class: S Alcohols from Kow & Mp


PI 


_ Regression  Results 

St4 

Variant  Caaf.  Errar  t 

Constant  o  73800  0  15tl  4  80 

log  Kow  1.1340  0  0308  -20? 

M  P  -25  0  01 


Analysts  of  Vertance  Table 

Saarca  B$S  dr  MSS  f 

Regressor*  442  478  1  447  863 

Residual  80  696?  1/3  0  6t84  7? 


Regression  Results 


Variant  Caaf 

Constant  [o  /4  7v 
log  Kow  -1  06/3 

M  P-25  0  01 


Std. 

C— f .  Crrar  t 

I  74  1v  jo  0081  7  03 

1  06/3  0  0308  2/7 

0  01 


Analysis  of  Verience  Table 
area  BSS  d#  ftSS 


Saarca  BSS 

Reo*«***wn  6  734/9 
Residual  0  036340 


ms  r 

0  7340  /4i 

0  000  00!) 


Krmtti  fa  watts  af  mol/L 

Log  S  vs.  Log  Kow 

23  r 


r7=  03  1%  176  s  =  o  7201 

Residual  vs.  Predicted  Residual  vs.  Prob. 


15  30  4.5  6  0 

tog  Kow 


-60  -30 

or*4Vit*d 


tw*t  «f  *tsrxj*rd 
dsvtotton* 


Rrsans  is  watts  af  .nol/L 

Log  S  vs  Log  Kow 

0  00  Tv 

-0  75  •  \v 
"*  -2  25  \ 


I  50  3  00 

tog  Kow 


r2=  09  I*a*s  -  «  S-  0  0063 

Residual  vs.  Predicted  Residual  ve  Prob 


-3  00  -150 

pradtetad 


04  06  OS  10 
af  rUrwtor* 
<to'pl4ttons 


PI 


STATISTICS  Class: S Anilines from Kow & Mp


Regression Results  Analysis of Variance Table


STATISTICS  Class: S Carbamates from Kow & Mp


Regression  Results 


Anatusls  of  Variance  Table 


VarlaMa 

Oaf 

Error  t 

Saarca  RSS  d# 

mss  r 

|  Constant 
(log  Kow 
Imp-25 

0.0078 

•  0738 
-0.01 

0  7887  0.01 

0  1045  -8  84 

6 

Regression  20  4460  i 

Residual  4  46113  1/ 

20  44  78  t 

0  261831 

{ 

trwKt  ta  watts 

n»of/i 

r2=  8?.i%  =  ts 

8  =  0.611 / 

file  Edit  Go  Print  Misc. 


f  lie  Edit  Go  Pnnt  Misc. 


STATISTICS  Class:  S  Haiogenated  Aliphatic*  trom  Kow  &  Mp 


a  a  & 


STATISTICS  Class:  s  Nonhatoganated  Aliphatic*  item  Kcw&Mp 


Regression  Results 
S14 

Cjgf.  Errer  i 


Constant 
fog  Kaw 
MP?5 


0  735* 

1  1600 
0  Ot 


0  3203 
0  0803 


2  23 
-13  1 


Saarc* 

RSS 

Of 

HSS 

f 

Regies***, 

?«  oeae 

1 

/«  oo 

1  !  \ 

ftewaual 

16  a ?9f 

36 

0  166663 

Regression  Results 


_ Analysts  of  Variance  Table 


Kes«Hs  Si  Mits  *f  mylJL 

Log  S  vs.  Log  Kow 


83  n«a*  =  3/  s~  o  eeji 

Residual  vs.  Predicted  Residual  vs  Prob 


vintiii 

Cwf 

Errar 

t 

Const  in 

0  73*6 

0  14BO 

4  06 

tog  Kow 

t  1163 

0  0366 

30  V 

MP  25 

0  01 

vr  :)/Vr"\ 

t!  y 0/086  ! 

c 

..HSS_ . f 

*  :  jwv 


la  watts  #f  "»o»t 

Log  S  vs  log  Kow 


-1  5 

w  -3  0  ■ 

^V,  r 

A.  » 

2 

0 

2 

r 

* 

-225 

^  -3  00 

/ 

/ 

,  02  4 
*  01  ■ 

1-4  5 

N.  ! 

. .  *  u 

?  -375 

*  00  •* 

-60 

d 

-2  • 

*  -2 

-450 

V  4  * 

2  3  4  3  6 

log  Kow 


-6  0  -4  5  -30  -15 

praCtcl#* 


-2  50  0  00  1  25 

r*jtr<b*r  of 

df»iatiori5 


225  375 

tog  Kow 


r  •  v*  »"*•  :  tu  s-  c  tic.- 

Residual  vs  Predicted  Residual  *s  Prob 

,  0  ? 

*  0 1 

— — . - . - .  *  GO 

d  C  t 


-4  50  -  J  00 

pp*d*ct*c 


-oo  c  4  c  e 

•**nt*r  of 

Orruhwvi 


*  file  Edit  60  Print  Misc.  (2)  <3  A 

4  File  Edit  Go  Pnnt  Misc. 

(2  <P 

m 

STATISTICS  ClaSS  ”.S  Haiogenated  Aromatics  from  Kow  &  Mp 

STATISTICS  Class  :s  pcs»  from  kow  «,  mp 

Regression  Results 


Variable 

Caef. 

Errsr 

t 

Constant 

0  4025 

0  ?«  j 

1  1 

o 

log  Kow 

•  1  0820 

0  0703  ! 

-  16  4 

. 

MP-zS 

-0  Ot 

| 

§ 

Sasrca 

RSS 

Of 

MSS 

f 

1  Regtemon 

134  264 

1 

134 

23  7 

32  2310 

6/ 

0  5654  56 

Regression  Results 


«trs«tts  to  wilts  of  moUI. 
Log  S  vs.  Log  Kow 


<n  -3.0 


r2:  M  **•>  lt^s=  6  0  g=  0  7520 

Residual  vs  Predicted  Residual  vs  Prob 


Variable 

Caef 

freer 

t 

Constant 

t  091  7 

0  76  7 

2  5 1 

O 

tog  Kow 

-  »  3701 

0  103 

0  6? 

MP  25 

0  Ot 

o 

Analysis  of  Variance  Tab! a 

d*  hss  r 


8*witts  to  Vttts  *f  mot/L 
Log  S  vs  Log  Kow 


r2:  81  «*.'  rUi  «  S  --  Q^tlK 

Residual  vs  Predicted  Residual  vs  Prob 


1.5  3  0  4  5  6  0 
teg  Kow 


10  12 
numttr  of  d jrxljrd 
dorvitwm 


4  5  50  5  5  6  0 
Gg  Kow 


-oo  o«  o e  i7 

r*wrnb*r  of  I'jrv}*-  ■> 


STATISTICS  Class: S PAHs from Kow & Mp


STATISTICS  Class: S Ureas from Kow & Mp


STATISTICS  Class: Kow Universal from S & Mp


STATISTICS  Class: Kow Alcohols from S & Mp


STATISTICS  Class: Kow Anilines from S & Mp


Regression Results  Analysis of Variance Table


STATISTICS  Class: Kow Carbamates from S & Mp


Regression Results

Variable       Coef.      Std. Error  t
Constant       0.4119     0.2418      1.70
log S (mol/L)  -.8889     0.1006      -8.84
MP-25          -.008889

Analysis of Variance Table

Source      RSS      df  MSS       F
Regression  19.6729  1   19.67     78.1
Residual    4.28306  17  0.251944

Variable  Ce*f  trrec  J 

Curuiait  o  wo*>  o  j«#64  o  n 
tog  S  (moWl)  86»>?  O  1296  6  *. 

MP  2S  00866? 


Results  m  writs  *f 

Log  Kow  vs.  Log  S 


r?=  n**s  =  io  s-  01,010 

Residual  vs  Predicted  Residual  vs.  Prob 


0  3964  0  fun  \  t  R «*g< «***»;* t  ty»j 
O  1296  6  60  PU‘wrJu#t  1  T? 

f 

_  <>j  - - 


Analysis  of  Variance  laCile 
Swte  RSS  dx  . f 

^l-gn*****!  tO»iV«»  1  to  60 

4»‘wrju*l  1  i*4jft  *)  y'Hifi  -'J  j 


Results  ta  Witts  •< 

log  Kow  vs.  Log  5 


r  *  »o  v*.  n**s  / 


Residue!  vs  Predicted  Residual  vs  Prod 

, 05 1  ;  cs{ 


*5  00  -2  50 

logs 


t  .25  2  50  3  75  5  00 
predicted 


-t  50  000 

fHjrr^er  of  standard 

drwUtwnt 


«  2  3  4 

pr«dKt«<] 


to  C  0  0  5 

i+rrb»t  Hi  i*#*nr> i 
e»»-wi*orit 


m 


*  file  Edit  Go  Pnnl  Misc. 


STATISTICS  Class:  Kow  HalOB«n*ted  Aliphatic*  from  S  A  Mp 


File  Edit  Go  Print  Misc. 


STATISTICS  Class :  Kow  Nonh»logen»t»<J  Allphallct  from  S  A  Mp 


Reqresslon  Results 


SW. 

Variable  Caef  Errar  t 


!  1087  0  ?  5  55 

log  S  (motflt  •  7089  0  054?  -13  l 

M  P-25  ■  007090 


Analysis  of  Variance  Table 


RSS  dr  MSS 


j  48  5377  1  140  64  |I71 

jRe««1ual  0  90900  I  36  j  0  2831  141 


Regression  Results 


Sti 

Variable  C«ef  trrar  t 


Corulanl  0  68  79  0  iiifi  J «  t  / 

tog  S  (moi/L)  •  o»e?  o  o?o 1  30  1 
M  P  25  •  00080?  i 


Analysis  of  Vorionce  Table 


Sea  re*  RSS  dr  MSS  f 


Heg*0M«3n  9  050(0  1  0  0504  |w?8 

Rrcwlual  0  07/964  6  0  009  7461 


Remits  i*  wilts  if 
Log  Kow  vs.  Log  S 


V*-  03  C<H,  n**s=  37  s :  0.5321 

Residual  vs  Predicted  Residual  vs  Prob 


tr-H-.  - -rt.  .(  '*  "•»«  =  S^IM 

Log  Kow  vs  Log  S  Residual  vs  Predicted  Residual  vs  Prob 

5  23  TV 


nienber  of  ftindard 
drvutiorc 


STATISTICS  Class: Kow Halogenated Aromatics from S & Mp


Regression Results

Variable       Coef.      Std. Error  t
Constant       1.1158     0.1963      5.68
log S (mol/L)  -.7447     0.0483      -15.4
MP-25          -.007447


Analysis of Variance Table

Source      RSS      df  MSS       F
Regression  92.3245  1   92.32     237
Residual    22.1637  57  0.388836
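A property-property relationship of this form is applied by substituting the compound's solubility and melting point into the fitted equation. The sketch below uses the coefficients as read from the halogenated aromatics table above. Treating the (MP - 25) term as zero for compounds that melt below 25 degrees C is an assumption of this example, not something the table states, and the input compound is hypothetical.

```python
import math

# Coefficients as read from the table above (Kow, halogenated aromatics, from S and Mp).
CONSTANT = 1.1158
SLOPE_LOG_S = -0.7447
SLOPE_MP = -0.007447  # per degree C of melting point above 25

def log_kow_from_solubility(s_mol_per_l, melting_point_c):
    """Estimate log Kow from aqueous solubility (mol/L) and melting point (deg C)."""
    # Assumption: liquids at 25 deg C receive no melting-point correction.
    mp_term = max(melting_point_c - 25.0, 0.0)
    return CONSTANT + SLOPE_LOG_S * math.log10(s_mol_per_l) + SLOPE_MP * mp_term

# Hypothetical compound: S = 1e-4 mol/L, melting point 53 deg C.
estimate = log_kow_from_solubility(1e-4, 53.0)
print(round(estimate, 3))
```

The same pattern applies to the other "from S & Mp" and "from Kow & Mp" classes, with the class-specific constant and slope substituted.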


«  File  Edit  Go  Print  Mlsc. 


STATISTICS  Cla55:  l^oc  Unlveraal  from  S 


Regression  Results  Analysis  of  Variance  Table 


std. 

Vartabl*  Caaf.  Error  t  Saerc* _  RSS 


Conctart  0  955372  0  1653  5  70  {> 

tog  s  (mol/L)  0.58564  0  0421  -13  8  jRewdual  1 23  8386  |66  jo  36119  I 

M  P -25  006856  — 


Results  b  tabs  «f 

Log  Kow  vs.  Log  S 


T*~  80  6%  1W  =  59  8=  0  6236 

Residual  vs.  Predicted  Residual  vs.  Prob 


Results  t*  writs  «f  Ukq 

Log  Koc  vs.  Log  S 

fit 


Source      RSS      df  MSS      F
Regression  70.0563  1   70.06    194
Residual    23.8386  66  0.36119

r2 = 74.6%   nobs = 68   S = 0.6010

Residual  vs  Predicted  Residua!  vs  Prob 


-60  -5  0 

logs 


230  303 

pr*<fte4»d 


-125  0  00  1  25 


STATISTICS  Class: Koc Halogenated Aromatics from S


STATISTICS  Class: Koc PAHs from S


STATISTICS  Class: Koc Ureas from S


Regression Results  Analysis of Variance Table


STATISTICS  Class: BCF Universal from Kow


_ Regression  Results 

SM 

Vlfjjtle  Cwf  irr<r  t 

COEWUm  1  1216  0  230  4  69 

log  S  (moi/l)  son  0  0639  6  13 


Shw  RSS  df 

Regienioo  603911  1 

Residual  2  58898  1  4 


MSS  f 

Te  0301  3/  0 

|o  »«40?; 


Regression  Results 


SM 

Venelte  C— f  1  rrer  t 

Corwlan  '”T"o!)»a?'  a  s«/8  1  ?o» 

log  Kow  0  880366  0  0481  18? 


Analysis  of  Vorience  Table 


Searc* . rss  . . dj  .  _.M3S 

Rt*g«<*i>*»r*i  **■.  0 1  vj  1  j46  0? 

Residual  t-  si  BO?  «'•  jo  i4i/o 

1  i  1 


KkwHs  «■  wits  «f  L/k® 

Log  Koc  vs.  Log  S 


r  17  8*?  n>fcj  -  |6  S  s  0  4  300 

Residual  vs  Predicted  Residual  vs  Prob 


XrwHs  it  watts  *f 

Log  BCF  vs  tog  Kow 


0  75  2.25 

PfCdKlfd 


-150  0  00 

numiwr  of  sltndv  d 
dfvUtWK 


250  575  500 

log  Kow 


r2  »■'  "*  <W  «*  s  0  •  «.« 

Residual  vs  Predicted  Residual  vs  Prob 

0  •.*  0  *  • 

0  0 -  00  ■ 

5  -0  5  -  ^-0  3 


1  7  5  4 

pretlKteC 


File  Edit  Go  Print  Misc. 


STATISTICS  Class:  *oc  Universal  from  Kow 


Regression  Results 


Venable 

Ceef 

SM 

Errer 

t 

Constant 
log  Kow 

1  Q0OI 
0.5801 

!0  1723 

0  0445 

[s  81  <> 

!  13  ? 

; _ o 

Analysis  of  Variance  Table 
arcs  RSS  dr  MSS 


File  Edit  Go  Print  Mtsc. 


STATISTICS  Clash  I  Koc  Anlllnee  Irom  Kow 


Regression  Results  Analysis  of  Variance  Table 


ReswMs  m  wits  «f  ‘-'*0 


Saarct  RSS 
R^giestfion  03  7441 
R«$<du0«  42.8716 

f-  2-  60  6%  nt|( 


mss  r 

#3  74  f  1  7b 

0  636803] 


Regression  Results 
SM 


Log  Koc  vs  tog  Kow  Residual  vs  Predicted  Residual  vs.  Prob 

.  500  |  ■■>'''  1  1  I  '  <  ! 


Varies!* 

Constant 
log  Kow 


Predicted  vs.  Exp. 


0  8605 

0  i  n 

4  06 

5j 

flflfltPUion 

4  30610  j  1 

4  3061  918  j 

0  5655 

0  060 

0  68 

Residual 

0  336 126  7 

0  04/8/5 

, 

Residual  vs  Predicted  Residual  vs  Prob 


-25  2  5  5.0 

V*Kov 


-0  0  1.5  3  0  4  5 
pr*dict#6 


-»  25  1  25 

ngmtwr  of  stsndard 


1  25  3  75 

log  Kow 


'  50  3  CIO 

pr*dte«*6 


-0  6  0  0 
nun-twr  «rf  (lands r4 
6t  yiatwns 


Fife  Edit  Go  Print  Misc. 


STATISTICS  Class:  Koc  Halogated  Aromatics  from  Kow 


Regression  Results 


SM. 

Virialli  CelErw _ t 


Corwtari  0.9838  o.ib  a  n 

log  Kow  0.6723  0  0451  12 


Analysis  of  Variance  Table 


File  Edit  Go  Print  Misc. 


STATISTICS  Class:  pah*  from  kow 


Regression  Results  Analysis  of  Variance  Table 


Saarce _ RSS 


j  Regro8*wn  142  0860 
I  Residual  0  85/04 


«a  mtHt  af  L/kg 

Predicted  vs.  Exp. 


r2=  81  3%  n«F 

Residual  vs.  Predicted 


mss  f 


4299  161 

0  26640/ 

•  =  30  S  -  O  5161 

Residual  vs.  Prob 


SM 


6J8 

0  8231  0  132Q  6  7 


1  Regression  1 14  1? 49  it 
]Re*duai  I?  9404?  8 


emits  Si  watts  af  L /kg 

Predicted  vs.  Exp. 


r2=  «?  8%  n*4«  =10  S  =  0  604.3 

Residual  vs  Predicted  Residual  vs  Prob 


12  3  4  5  ’ 

-1  50  0  00  1  50 

3 

4  5 

6 

■ 

3  4  5 

*  ’ 

-00  05  10  15 

ngniMr  of  sfmdwd 

S 

* 

mjmfcpr  «f  rljpndwd 

prwheted 

log  Kow 

pndtetad 

dtorUliont 

STATISTICS  Class: Koc Karickhoff et al. 1979 from Kow


Karickhoff, S.W., D.S. Brown, and T.A. Scott. 1979. Sorption of hydrophobic
pollutants on natural sediments. Water Res. 13:241-248.
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Hansch  C..J.E.  Quinlan  and  Q  L  Lawrence,  “The  Linear  Free-Energy 
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‘Environmental  Fate  of  Selected  Phosphate  Esters,*  Environ  Set 
Technol  .  13,  840-44  (1979) 


1 


Yalkowsky, S.H., R.J. Orr, and S.C. Valvani, "Solubility and Partitioning. 3.
The Solubility of Halobenzenes in Water," Ind. Eng. Chem. Fundam.,
18, 351-53 (1979).


Regression  Results _ 

Std 

jfirjilU  Ceef.  Errer  t 

Conetant  1?  0 

Log  Kow  ■>  3H 

Results  te  witts  *f  #*noHL 


Analysis  of  Variance  Table 


Sserct  CSS 


Regress! on  Results 


r2=  66  6  n**,  : 


_ J  [*» 


Vsriebis 

CorwUri 
*  og  Kow 

Mp 


s»d 

CecT  Errer  t 


Analysis  of  Variance  Table 
Scarce _  RSS  _  Q t  MSS 


o  /i  /n  O 

o  u»/« 

0  0006 

_ U _ _ JoJ  - 1  -  - . 


ReseKs  to  eelts  *f  moiii 


r?-  vv  o  n. t 


STATISTICS  Class: S Mackay and Shiu 1977 PAHs from Kow & Mp


STATISTICS  Class: Koc Kenaga and Goring 1976 Pesticides


Mackay, D. and W.Y. Shiu, "Aqueous Solubility of Polynuclear Aromatic
Hydrocarbons," J. Chem. Eng. Data, 22, 399-402 (1977).
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STATISTICS  Class  :s  Kenaga  and  Goring  from  Kow 
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STATISTICS  Class:  koc  Brown  et.al  aromatics 
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STATISTICS  Class  IKoo  Briggs  1973 


Briggs, G.G., "A Simple Relationship Between Soil Adsorption of Organic
Chemicals and Their Octanol/Water Partition Coefficients," Proc. 7th
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STATISTICS  Class:  BCF  Kenaga  and  Goring  from  S 


Kenaga, E.E. and C.A.I. Goring, "Relationship Between Water Solubility,
Soil-Sorption, Octanol-Water Partitioning and Bioconcentration of
Chemicals in Biota," pre-publication copy of paper dated October 13.
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STATISTICS  Class:  Kow  Universal 
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Analysis  of  Variance  Toble 
Sasrc*  RSS  dr  MSS  f 


Regression  Results 


Source      RSS      df  MSS      F
Regression  8.09101  1   8.0910   62.4
Residual    1.03735  8   0.12966

r2 = 88.6%   nobs = 10   S = 0.3601

Variable   Coef.    Std. Error  t
Constant   0.6555   0.3295      1.99
TSA        0.0208   0.0014      14.3
N-TSA      -.1497   0.0276      -5.43
O-TSA      -.0584   0.0095      -6.16
ArN-TSA    -.0659   0.0177      -3.73


Analysis of Variance Table

Source      RSS      df   MSS      F
Regression  248.812  4    62.20    79.6
Residual    81.3114  104  0.78184
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.  5  25  t  X" 


125  175 

TSA 


Residual  vs  Predicted 

0  50 

r 

0  25  •  « 

0  00  - - - -  ‘ 


3  00  4.50 

pK*dictPd 


Residual  vs.  Prob 

050  -t- 
025  4 


0  8  1.2  16  2  0 
nurtwr  of  standard 
deviations 


Predicted  vs.  Exp. 


Residual  vs  Predicted 


Residual  vs  Prob 


-0.0  2  5  5  0  7  5 
Exprrmmtjl  IpgKov 


2  4  6 

predicted 


-10  12 

rumbov  of  standard 
d*  via  lions 
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Analysis  of  Variance  Table 
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STATISTICS  Class:  Kow  pah* 


Regression Results

Variable   Coef.    Std. Error  t
Constant   -.2628   0.3757      -.699
TSA        0.025    0.0016      16.7

Source      RSS      df  MSS      F
Regression  23.4248  1   23.42    278
Residual    2.52769  30  0.08425

r2 = 90.3%   nobs = 32   S = 0.2903

Experimental  vs.  TSA  Residual  vs.  Predicted  Residual  vs.  Prob 

»  e  +  ^  0  30  1  0  30  T 


Coef.    Std. Error  t          (variable labels not legible)
-.1621   0.7424      -.218
0.0243   0.0034      7.1
-.1243   0.0255      -4.88


Analysis of Variance Table

Source      RSS      df  MSS      F
Regression  22.0177  2   11.01    26.7
Residual    9.47765  23  0.41207


Predicted vs. Exp.

r2 = 69.9%   nobs = 26   S = 0.6419

Residual  vs.  Predicted  Reslduel  vs.  Prob 


200  240  200  320 
TSA 


-0  0  0  5  10  15 
nurrtwr  of  standard 
derations 


3  4  5  6 

E*por1m*nta1  togKcv 


5  4  5 

pr«die1*d 


-1  50  0  00 
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deviations 
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STATISTICS  Class:  Kow  Triazines 


Analysis  of  Variance  Table 


Regression Results

Variable   Coef.     Std. Error  t
Constant   -1.0307   0.6601      -1.56
TSA        0.017     0.0031      5.43

Source      RSS      df  MSS      F
Regression  13.6070  1   13.61    29.5
Residual    7.39176  16  0.46198

r2 = 64.8%   nobs = 18   S = 0.6797

[Plots: Experimental vs. TSA; Residual vs. Predicted; Residual vs. Prob.]


Regression Results

Constant 

19868  13 

-1  64 

“1 SI 

TSA 

0  0204  0  0051 

3  96 

■ 

N-TSA 

0489  0.0116 

-4  2 

l| 

Analysis  of  Variance  Table 
Seurat  rss  af  nss 

Regr»s«*on  0  901201  2  0 

Residual  0  0*5630  2  0 


Predicted  vs.  Exp. 


r2=  93  2%  «***  =  6  s-  0  1812 

Residual  vs.  Predicted  Residual  vs.  Prob 


150  250 

TSA 


2  3  4 

pradtotM 


-OS  05  10 

ruT«*f  of  standard 
drr  tattoos 


24  27  30  S3 
£xp#rfm*nti1  logKov 


2  25  2  75 

pratftetad 


-1  200  - 1  050 

nuntfwr  of  standard 
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Regression  Results 


Variable   Coef.     Std. Error  t
Constant   -5.3722   0.3581      -15
TSA        0.0306    0.0017      18.2


Analysis of Variance Table

Source      RSS       df  MSS      F
Regression  18.7323   1   18.73    330
Residual    0.284226  5   0.05684


STATISTICS  Class:  Pv  Universal 


_ Regression  Results _  _ jtaolijsis  of_Varience  Tnbfe 


_ Regres sion  Result s _ 

Std 

VtritUt  Cttf  Error  t  _ 

Constant  6  506  |0  5308  I  12  3 

TSA  0288  0  0028  - 1  0  1 

O-TSA  08  0  0227  3  52 


Source  rss 

mss  r 

Regie* s«or:  30  366 

!  2 

1 76  1&6  ? 

RetKluel  39  0  131 

!1?8 

3  0S62«j 

i 

r2 = 98.5%   nobs = 7   S = 0.2384

Experiments!  vs.  TSA  Residual  vs.  Predicted  Residual  vs  Prob 


Predicted  vs.  Exp 


_ _E>]  r2=  47  4  ».  n***::  128  1?60 

Residual  vs  Predicted  Residual  vs  Prob 


100  150  200 

TSA 


-2.50  0  00 

predicted 


6  -1  4 

nurroer  ot  standard 
dtvaalwu 


-5  0  0  0 

Ex^srmenUl  Pv 


•1  5  00  «5 

nuntw*  of  standard 
de*»etK>n3 
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STATISTICS  Class.*  Pv  Nonhalogenated  Aliphatics 


Regression Results

Variable   Coef.    Std. Error  t
Constant   5.4345   0.2932      18.5
TSA        -.0128   0.0017      -7.74


Analysis of Variance Table

Source      RSS      df  MSS      F
Regression  20.1520  1   20.15    59.9
Residual    12.1187  36  0.33663

r2 = 62.4%   nobs = 38   S = 0.5802
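Every single-descriptor table in this appendix (coefficients, standard errors, t values, F, r2, S) can be produced by ordinary least squares on one predictor. The sketch below shows that calculation in plain Python. The TSA and log Pv values are hypothetical and serve only to exercise the function; they are not data from the report, and the function name is illustrative.

```python
import math

def fit_line(x, y):
    """Ordinary least squares for y = a + b*x, returning table-style statistics."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_total = sum((yi - mean_y) ** 2 for yi in y)
    ss_residual = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_regression = ss_total - ss_residual
    ms_residual = ss_residual / (n - 2)  # two fitted parameters
    se_slope = math.sqrt(ms_residual / sxx)
    se_intercept = math.sqrt(ms_residual * (1.0 / n + mean_x ** 2 / sxx))
    return {
        "coef": (intercept, slope),
        "std_error": (se_intercept, se_slope),
        "t": (intercept / se_intercept, slope / se_slope),
        "F": ss_regression / ms_residual,
        "r2_percent": 100.0 * ss_regression / ss_total,
        "S": math.sqrt(ms_residual),
    }

# Hypothetical data, NOT taken from the report.
tsa = [120.0, 150.0, 180.0, 210.0, 240.0, 270.0]
log_pv = [3.9, 3.4, 3.2, 2.6, 2.4, 1.9]
fit = fit_line(tsa, log_pv)
print(fit)
```

With a single predictor, the squared t value of the slope equals F, which matches the tables here (for example, 7.74 squared is about 59.9).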

Regression Results

Variable   Coef.    Std. Error  t
Constant   6.7505   0.2911      23.2
TSA        -.0181   0.0014      -12.9


Analysis of Variance Table

Source      RSS      df  MSS      F
Regression  21.2642  1   21.26    168
Residual    2.41075  19  0.12688


[Plots: Experimental vs. TSA; Residual vs. Predicted; Residual vs. Prob.]

r2 = 89.8%   nobs = 21   S = 0.3562

Residual  vs.  Predicted  Residual  vs.  Prob 


0  75  1  30  223 
number  of  standard 
deviations 
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Analysis  of  Variance  Table 


Source  RSS  df  MSS  F


Regression Results

Variable   Coef.    Std. Error  t
Constant   8.3422   0.54        15.4
TSA        -.0426   0.0026      -16.5


[Plots: Experimental vs. TSA; Residual vs. Predicted; Residual vs. Prob.]
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STATISTICS  Class:  Pv  pcb« 


Analysis of Variance Table

Source      RSS      df  MSS      F
Regression  182.189  1   182      272
Residual    20.0981  30  0.66993

r2 = 90.1%   nobs = 32   S = 0.8185

Regression Results

Variable   Coef.    Std. Error  t
Constant   7.9828   0.9208      8.68
TSA        -.0413   0.0038      -10.9


Analysis of Variance Table

Source      RSS      df  MSS      F
Regression  47.2893  1   47.29    120
Residual    5.14347  13  0.39565

r2 = 90.2%   nobs = 15   S = 0.6290

[Plots: Experimental vs. TSA; Residual vs. Predicted; Residual vs. Prob. (number of standard deviations)]
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STATISTICS  Class:  Pv  pah« 


_ B8qi»s«ton  R«»ulti _ 

Std. 

VarUkl*  C««f.  Frr.r  t 
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STATISTICS  Class:  H  Universal 


Analysis  of  Vsriancs  Table 


CowtJ*  <S  1803  1  876  «  ss 

TS»  -  0748  0  OOOS  -7  68 


S48rc«  RSS 


Regression  5  4  5426  1 

Residual  0  64705  !0 


Regression  Results 
Std. 

It  Cetf.  Error  t 


1  8637  0  3740  4  0? 

•012  0  0010  -6  36 

-  0807  0  0183  -4.07 


Analysis  of  Variance  Table 

Stare#  RSS  d«  MSS  F 


Regression  207  04T  2  104  40  4 

Residual  36 1  605  141  2  56386 


Experimental  vs.  TSA 


_ Eg]  r2=asi%  rw 

Residual  vs.  Predicted 
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STATISTICS  Class:  H  Halogenated  Aromatics 


Regression Results  Analysis of Variance Table

Variable   Coef.    Std. Error  t
Constant   0.629    0.3242      1.94
TSA        -.0103   0.0015      -6.88
O-TSA      -.0724   0.0117      -6.19

Source      RSS      df  MSS      F
Regression  24.1661  2   12.08    64.7
Residual    5.97456  32  0.18670

r2 = 80.2%   nobs = 35   S = 0.4321

[Plots: Predicted vs. Exp.; Residual vs. Predicted; Residual vs. Prob.]
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STATISTICS  Class:  Koc  Universal 


_ Regression  Results _  Anal 

std 

Variable  C#ef  frrer  t  S»»rce 


Variable 

C#*f 

Std 

lrr*r 

Conslart 

0  2238 

0  2634 

TSA 

0  0145 

0  0012 

N-TSA 

0409 

0  0134 

O-TSA 

0247 

0  0049 

AiN-TSA 

0445 

0.01 1 2 

Analysis  of  Varionce  Tobl e 

•re*  RSS  df  MSS 


S»>rce  _ RSS 

Royiesswrij  123  5  38 
Residual  I  1  2  7  4  20 


MSS  t 

30  a  jaw  a  i 

0  /8es4  1 


Predicted  vs.  Exp. 

5  t 


99  KH  r  ^  =  4 »  2 w  rut 

Residual  vs  Predicted 


-3  -2  -t 

Exfxr  rrxfitj]  H 


-3  -2  -I 

predicted 


-0  9  -0.3 

number  of  standard 
dfvWtk ns 


0  2  4 

Experimental  Koc 


1  2  3 

predated 


►*=167  S  =  C  6669 

Residual  vs  Prob 

2  3C 

1  75  ■  y 

0  00  - 
-125- 


-15  0  0  15 

nunrAer  cl  stared 
4*twi*ot.s 
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STATISTICS  Class:  Koc  Halogenated  Aromatics 


Regression Results

Variable   Coef.    Std. Error  t
Constant   0.1657   0.3586      0.462
TSA        0.0168   0.0016      10.3
N-TSA      -.0631   0.019       -3.32
O-TSA      -.052    0.0082      -6.32
ArN-TSA    -.0519   0.013       -3.97

Analysis of Variance Table
Source  RSS  df  MSS  F
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STATISTICS  Class:  Koc  pahs 


Source      RSS      df  MSS      F
Regression  73.5330  4   18.38    39.5
Residual    29.8175  64  0.46589

r2 = 71.1%   nobs = 69   S = 0.6826
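A structure-property relationship with several surface-area descriptors is applied as a weighted sum of the descriptors plus the constant. The sketch below uses the four-descriptor coefficients read from the Koc Halogenated Aromatics table (constant 0.1657, TSA 0.0168, N-TSA -.0631, O-TSA -.052, ArN-TSA -.0519). The descriptor values passed in are hypothetical, and treating a missing descriptor as zero is an assumption of this example.

```python
# Coefficients as read from the Koc Halogenated Aromatics coefficient table.
INTERCEPT = 0.1657
COEFFS = {"TSA": 0.0168, "N-TSA": -0.0631, "O-TSA": -0.052, "ArN-TSA": -0.0519}

def predict_log_koc(descriptors):
    """Estimate log Koc from surface-area descriptors (dict of name -> value)."""
    # Assumption: a descriptor absent from the dict contributes zero.
    return INTERCEPT + sum(
        coef * descriptors.get(name, 0.0) for name, coef in COEFFS.items()
    )

# Hypothetical compounds, for illustration only.
print(round(predict_log_koc({"TSA": 200.0}), 4))
print(round(predict_log_koc({"TSA": 200.0, "O-TSA": 10.0}), 4))
```

The second call shows how an oxygen contribution lowers the estimate relative to a hydrocarbon of the same total surface area.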

Regression Results

Variable   Coef.    Std. Error  t
Constant   0.8402   0.6065      1.39
O-TSA      .1602    0.0182      8.81
TSA        0.0178   0.0026      6.83


Analysis of Variance Table

Source      RSS      df  MSS      F
Regression  24.5865  2   12.29    57.4
Residual    2.57223  12  0.21436

r2 = 90.5%   nobs = 15   S = 0.4630

[Plots: Predicted vs. Experimental Koc; Residual vs. Predicted; Residual vs. Prob. (number of standard deviations)]
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predicted 
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_  Regression  Results  Anal 
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STATISTICS  Cla5S:  Koc  Ureas 


Variable   Coef.    Std. Error  t
Constant   -.6948   0.5926      -1.17
TSA        0.0118   0.0021      5.5


Analysis of Variance Table

Source      RSS       df  MSS      F
Regression  0.809500  1   0.80950  30.2
Residual    0.240918  9   0.02678

r2 = 77.1%   nobs = 11   S = 0.1636

Regression Results

Variable     Coef.     Std. Error  t
Constant     -3.1631   1.147       -2.76
TSA          0.0164    0.0028      5.54
[illegible]  0.148     0.0681      2.17
[illegible]  0.0404    0.0148      2.73


Analysis of Variance Table

Source      RSS      df  MSS      F
Regression  8.38117  3   2.7937   15.3
Residual    4.02460  22  0.18294

r2 = 67.6%   nobs = 26   S = 0.4277
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STATISTICS  Class:  BCF  Universal 


_ Regression  Results _  Anal 

std. 

Variant  Caaf.  E  rrar  t  Stare* 
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STATISTICS  Class :s  pah*  from  ut 


Analysis  of  Varieties  Table 


Stare*  RSS 


MSS 


7672  155 

0  48755 
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Transport  Parameters.  Thesis,  Utah  State  University. 
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